Q-learning and Pontryagin's Minimum Principle
- Speaker: Professor Sean Meyn (Director, Decision & Control Lab, CSL ECE UIUC)
- Date & Time: Wednesday 13 January 2010, 14:00 - 15:00
- Venue: Cambridge University Engineering Department, Lecture Theatre 6
Abstract
Q-learning is a technique for computing an optimal policy for a controlled Markov chain from observations of the system while it is controlled by a non-optimal policy. It has proven effective for models with finite state and action spaces. This paper establishes connections between Q-learning and nonlinear control of continuous-time models with general state space and general action space. The main contributions are summarized as follows.
- The starting point is the observation that the “Q-function” appearing in Q-learning algorithms is an extension of the Hamiltonian that appears in the Minimum Principle. Based on this observation we introduce the steepest descent Q-learning (SDQ-learning) algorithm to obtain the optimal approximation of the Hamiltonian within a prescribed finite-dimensional function class.
- A transformation of the optimality equations is performed based on the adjoint of a resolvent operator. This is used to construct a consistent algorithm based on stochastic approximation that requires only causal filtering of the time-series data.
- Several examples are presented to illustrate the application of these techniques, including application to distributed control of multi-agent systems.
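As background for the finite state and action setting mentioned at the start of the abstract, the following is a minimal sketch of standard tabular Q-learning in discrete time. It is not the SDQ-learning algorithm of the talk. The three-state chain, the cost function, and the names `q_learning`, `greedy_policy`, `step`, and `cost` are assumptions made purely for illustration. Data is generated by a uniformly random (non-optimal) policy, and costs are minimized rather than rewards maximized, to match the Minimum Principle convention.

```python
import random

def q_learning(n_states, n_actions, step, cost,
               gamma=0.9, alpha=0.1, n_steps=20000, seed=0):
    """Tabular Q-learning for a cost-minimizing controlled Markov chain."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    x = 0
    for _ in range(n_steps):
        # Behavior policy: uniformly random actions (deliberately non-optimal).
        u = rng.randrange(n_actions)
        x_next = step(x, u, rng)
        # One-step target: current cost plus discounted minimal future cost.
        target = cost(x, u) + gamma * min(Q[x_next])
        Q[x][u] += alpha * (target - Q[x][u])
        x = x_next
    return Q

def greedy_policy(Q):
    """For each state, pick the action with the smallest estimated Q-value."""
    return [min(range(len(row)), key=row.__getitem__) for row in Q]

# Hypothetical toy model: states 0, 1, 2 on a line.
# Action 0 moves left, action 1 moves right, clipped at the ends.
def step(x, u, rng):
    move = 1 if u == 1 else -1
    return max(0, min(2, x + move))

# Hypothetical cost: proportional to the distance from state 0.
def cost(x, u):
    return float(x)

Q = q_learning(3, 2, step, cost)
policy = greedy_policy(Q)
print(policy)
```

Although the data comes from random actions, the greedy policy read off from the learned table is the one that always moves toward state 0. This is the off-policy property that the abstract refers to.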
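The link between the Q-function and the Hamiltonian can be sketched for a deterministic continuous-time model with discounted cost. The notation below (dynamics $f$, cost $c$, costate $p$, value function $J^*$, discount rate $\gamma$) is assumed here for illustration and is not taken from the abstract.

```latex
\begin{align*}
  % Controlled dynamics and the Hamiltonian of the Minimum Principle
  \dot{x} &= f(x,u), &
  H(x,p,u) &= c(x,u) + p^{\top} f(x,u), \\
  % Candidate Q-function: the Hamiltonian with the costate set to the value gradient
  H^*(x,u) &= c(x,u) + \nabla J^*(x)^{\top} f(x,u), \\
  % Hamilton-Jacobi-Bellman equation and the resulting optimal feedback law
  \min_{u} H^*(x,u) &= \gamma J^*(x), &
  u^*(x) &\in \operatorname*{arg\,min}_{u} H^*(x,u).
\end{align*}
```

Under these assumptions, $H^*$ plays the role of the Q-function: it is a function of state and action whose minimizer over $u$ gives an optimal feedback law, in the same way that the greedy action is read off from a Q-table in the finite setting.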
Series: This talk is part of the CUED Control Group Seminars series.