BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Q-learning and Pontryagin's Minimum Principle - Professor Sean Mey
 n (Director\, Decision & Control Lab\, CSL ECE UIUC)
DTSTART:20100113T140000Z
DTEND:20100113T150000Z
UID:TALK21985@talks.cam.ac.uk
CONTACT:Dr Ioannis Lestas
DESCRIPTION:Q-learning and Pontryagin's Minimum Principle: https://netf
 iles.uiuc.edu/meyn/www/spm_files/Q2009/Q09.html\n\nQ-learning is a techniq
 ue used to compute an optimal policy for a controlled Markov chain based o
 n observations of the system controlled using a non-optimal policy. It has
  proven to be effective for models with finite state and action space. Thi
 s paper establishes connections between Q-learning and nonlinear control o
 f continuous-time models with general state space and general action space
 . The main contributions are summarized as follows.\n\n    * The starting 
 point is the observation that the "Q-function" appearing in Q-learning alg
 orithms is an extension of the Hamiltonian that appears in the Minimum Pri
 nciple. Based on this observation we introduce the steepest descent Q-lear
 ning (SDQ-learning) algorithm to obtain the optimal approximation of the H
 amiltonian within a prescribed finite-dimensional function class.\n    * A
  transformation of the optimality equations is performed based on the adjo
 int of a resolvent operator. This is used to construct a consistent algori
 thm based on stochastic approximation that requires only causal filtering 
 of the time-series data.\n    * Several examples are presented to illustra
 te the application of these techniques\, including application to distribu
 ted control of multi-agent systems.\n
LOCATION:Cambridge University Engineering Department\, Lecture Theatre 6
END:VEVENT
END:VCALENDAR
