Optimal Bayesian Reinforcement Learning on Trees
- Speaker: Philipp Hennig (University of Cambridge)
- Date & Time: Monday 18 May 2009, 15:00-16:00
- Venue: TCM Seminar Room, Cavendish Laboratory, Department of Physics
Abstract
The “Q-Learning” algorithm is the classical solution to the so-called “Optimal” Reinforcement Learning problem. Q-Learning uses samples of future rewards, generated by a non-optimal policy, to derive point estimates of the future rewards under the (unknown) optimal policy.
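For concreteness, here is a minimal sketch of the classical tabular Q-Learning update described above; the environment interface, exploration scheme, and hyperparameters are illustrative assumptions, not details from the talk.

```python
import random
from collections import defaultdict

def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """One Q-Learning update: a point estimate of the optimal future reward."""
    # Bootstrap from the current greedy value of the next state.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    # Temporal-difference step toward the sampled return.
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def epsilon_greedy(Q, s, actions, eps=0.1):
    """Behaviour policy: non-optimal (exploratory), as the abstract notes."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])

# Usage: Q = defaultdict(float); a = epsilon_greedy(Q, s, actions); ...
```

Note that the update keeps only a single number per state-action pair; a Bayesian treatment would instead maintain a belief (a distribution) over these unknown values.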
In the first part of this talk, I will show that a Bayesian treatment, by forcing us to define our assumptions explicitly, reveals some interesting aspects of this problem that appear to have been overlooked so far.
In the second part, I will introduce an algorithm that uses Expectation Propagation to generate beliefs over possible future rewards from the optimal policy when the Markov environment forms a tree (i.e. “Bayesian Q-Learning on trees”), and I will show some preliminary results for its application to Game Trees.
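The core computation in such a tree-structured scheme is a moment-matching step: the distribution of the maximum of Gaussian value beliefs is itself approximated by a Gaussian. As a hedged illustration only (a reconstruction of the general idea, not the speaker's actual algorithm), the sketch below uses the classical moments of the maximum of two independent Gaussians (Clark, 1961) and folds leaf beliefs up a max-tree; all function names are hypothetical.

```python
from math import sqrt, erf, exp, pi

def _phi(x):   # standard normal pdf
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def _Phi(x):   # standard normal cdf
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def max_gaussian_moments(m1, v1, m2, v2):
    """Mean and variance of max(X, Y) for independent X~N(m1,v1), Y~N(m2,v2)."""
    theta = sqrt(v1 + v2)
    a = (m1 - m2) / theta
    mean = m1 * _Phi(a) + m2 * _Phi(-a) + theta * _phi(a)
    second = ((v1 + m1 * m1) * _Phi(a)
              + (v2 + m2 * m2) * _Phi(-a)
              + (m1 + m2) * theta * _phi(a))
    return mean, second - mean * mean

def propagate(node):
    """Fold leaf beliefs up a max-tree.

    A node is either a leaf belief (mean, var) or a list of child nodes.
    """
    if isinstance(node, tuple):
        return node
    m, v = propagate(node[0])
    for child in node[1:]:
        m2, v2 = propagate(child)
        m, v = max_gaussian_moments(m, v, m2, v2)
    return m, v

# Usage: propagate([(0.0, 1.0), [(0.5, 2.0), (-0.3, 0.5)]])
```

A full Expectation Propagation treatment would additionally pass messages back down the tree to refine the local beliefs; the single upward pass above is only meant to convey the basic moment-matching idea.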
Series
This talk is part of the Machine Learning Journal Club series.
Included in Lists
- Cambridge talks
- Guy Emerson's list
- Hanchen DaDaDash
- Inference Group Journal Clubs
- Inference Group Summary
- Interested Talks
- Machine Learning Journal Club
- Machine Learning Summary
- ML
- Quantum Matter Journal Club
- rp587
- TCM Seminar Room, Cavendish Laboratory, Department of Physics
- TQS Journal Clubs
- yk373's list