
Optimal Control and Reinforcement Learning with Gaussian Process Models


If you have a question about this talk, please contact Zoubin Ghahramani.

Optimal control and reinforcement learning (RL) share the same objective: optimization of a long-term performance measure. While the system in optimal control problems is usually known, RL has a more general setup that includes possibly unknown environments. However, after learning a model, standard algorithms for optimal control can also be applied to RL.

In this talk, a generalization of dynamic programming (DP) to continuous-valued state and action spaces is given. The proposed algorithm (GPDP) combines Gaussian process (GP) models with DP and yields an approximately optimal closed-loop policy on the entire state space. We apply GPDP to the underactuated pendulum swing-up. For exactly known environments, we show that GPDP yields a close-to-optimal solution. Moreover, we show that GPDP can successfully be applied to stochastic optimal control problems.
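The core idea described above — representing the value function with a GP fitted at a finite set of support states, and performing Bellman backups against that model — can be sketched in a few lines. The following is a minimal, self-contained illustration only, not the speaker's actual algorithm: the 1-D dynamics, reward, kernel hyperparameters, and grids are all invented for the example.

```python
import numpy as np

def rbf_kernel(A, B, ell=1.0, sf2=1.0):
    """Squared-exponential covariance between two sets of points."""
    d = A[:, None, :] - B[None, :, :]
    return sf2 * np.exp(-0.5 * np.sum(d**2, axis=2) / ell**2)

class GP:
    """Bare-bones GP regressor (mean prediction only)."""
    def __init__(self, X, y, ell=1.0, sf2=1.0, noise=1e-6):
        self.X, self.ell, self.sf2 = X, ell, sf2
        K = rbf_kernel(X, X, ell, sf2) + noise * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)

    def predict(self, Xs):
        return rbf_kernel(Xs, self.X, self.ell, self.sf2) @ self.alpha

# Hypothetical deterministic 1-D dynamics and quadratic reward,
# chosen purely for illustration.
def dynamics(x, u):
    return 0.9 * x + u

def reward(x, u):
    return -(x**2) - 0.1 * u**2

X = np.linspace(-2, 2, 21)[:, None]   # support states
U = np.linspace(-1, 1, 11)            # candidate actions
gamma = 0.9                           # discount factor
V = np.zeros(len(X))                  # value function at support states

for _ in range(50):
    gp_v = GP(X, V)                   # GP generalizes V to the whole state space
    Q = np.empty((len(X), len(U)))
    for j, u in enumerate(U):
        x_next = dynamics(X, u)       # successor states (possibly off-grid)
        Q[:, j] = reward(X[:, 0], u) + gamma * gp_v.predict(x_next)
    V = Q.max(axis=1)                 # Bellman backup at the support states

policy = U[Q.argmax(axis=1)]          # greedy closed-loop policy on the grid
```

The point of the GP here is that `predict` evaluates the value function at successor states that need not lie on the support grid, which is what lets DP operate over continuous-valued states rather than a discretized one.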

This talk is part of the Machine Learning @ CUED series.


© 2006-2025 Talks.cam, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity