Offline Reinforcement Learning
- Speakers: Max Patacchiola (University of Cambridge), Stephen Chung (University of Cambridge), Adam Jelley (University of Edinburgh)
- Date & Time: Wednesday 15 February 2023, 11:00-12:30
- Venue: Cambridge University Engineering Department, CBL Seminar room BE4-38.
Abstract
In the first part of the talk we will introduce the common terminology of standard online RL. We will then define the offline RL setting and describe its applications and benchmarks, before focusing on behavioural cloning (BC), a simple and stable baseline for learning a policy from offline interaction data. As a particular instance of BC, we will describe the decision transformer, a recently proposed method that leverages the transformer architecture to tackle the offline RL setting. In the second part of the talk, we will explore how off-policy RL algorithms originally designed for the online setting (such as SAC) can be adapted to handle the distribution shift required to improve on the policy that generated the offline data, without online feedback. We will find that this reduces to a problem of quantifying and managing uncertainty. In the third and final part of the talk, we will first review classical offline RL methods, including ways to evaluate and improve policies from offline data via importance sampling, and discuss their challenges and applicability. We will then review modern offline RL methods, including policy-constraint methods and model-based offline RL. Policy-constraint methods encourage the new policy to stay close to the policy observed in the offline dataset, while model-based offline RL methods quantify model uncertainty and use it to discourage the new policy from visiting uncertain regions.
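As a small illustration of the importance-sampling idea mentioned above, the sketch below (plain Python on a made-up two-action bandit; the function and variable names are ours, not from the talk) estimates a target policy's value from trajectories logged under a different behaviour policy, by reweighting each return with the product of per-step probability ratios:

```python
import random

def is_estimate(trajectories, target_probs, behavior_probs):
    """Ordinary importance-sampling estimate of a target policy's value.

    Each trajectory is a list of (action, reward) pairs logged under the
    behaviour policy; its weight is the product of per-step ratios
    pi(a) / b(a) between target and behaviour action probabilities.
    """
    total = 0.0
    for traj in trajectories:
        weight = 1.0
        ret = 0.0
        for action, reward in traj:
            weight *= target_probs[action] / behavior_probs[action]
            ret += reward
        total += weight * ret
    return total / len(trajectories)

# Toy two-action bandit: behaviour policy is uniform, action 1 pays 1.0.
random.seed(0)
behavior = {0: 0.5, 1: 0.5}
target = {0: 0.1, 1: 0.9}  # target policy favours the rewarding action
data = []
for _ in range(10_000):
    a = random.choice([0, 1])
    r = 1.0 if a == 1 else 0.0
    data.append([(a, r)])

print(is_estimate(data, target, behavior))  # close to 0.9, the true value
```

The estimator is unbiased but, as the talk's discussion of challenges suggests, its variance grows quickly with horizon, since the importance weights are products over all steps of a trajectory.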
References:
Levine, S., Kumar, A., Tucker, G., & Fu, J. (2020). Offline reinforcement learning: Tutorial, review, and perspectives on open problems. arXiv preprint arXiv:2005.01643.
Chen, L., Lu, K., Rajeswaran, A., Lee, K., Grover, A., Laskin, M., ... & Mordatch, I. (2021). Decision transformer: Reinforcement learning via sequence modeling. Advances in Neural Information Processing Systems, 34, 15084-15097.
Fujimoto, S., & Gu, S. S. (2021). A minimalist approach to offline reinforcement learning. Advances in Neural Information Processing Systems, 34, 20132-20145.
An, G., Moon, S., Kim, J. H., & Song, H. O. (2021). Uncertainty-based offline reinforcement learning with diversified Q-ensemble. Advances in Neural Information Processing Systems, 34, 7436-7447.
Series: This talk is part of the Machine Learning Reading Group @ CUED series.
