Breaking the Sample Size Barrier in Reinforcement Learning

  • Yuxin Chen, University of Pennsylvania
  • Wednesday 12 November 2025, 14:00-15:00
  • MR5, CMS Pavilion A.

If you have a question about this talk, please contact Dr Varun Jog.

Abstract: Emerging reinforcement learning (RL) applications necessitate the design of sample-efficient solutions in order to accommodate the explosive growth of problem dimensionality. Despite the empirical success, however, our understanding of the statistical limits of RL remains highly incomplete. In this talk, I will present some recent progress towards settling the sample complexity limits in RL. The first scenario is concerned with RL with a generative model, which allows one to query arbitrary state-action pairs to draw independent samples. We prove that a model-based algorithm (a.k.a. the plug-in approach) achieves minimax-optimal sample complexity without any burn-in cost. The second scenario is concerned with online RL, where an agent learns via real-time interactions with an unknown environment. We develop the first algorithm, an optimistic model-based algorithm, that achieves minimax-optimal regret for the entire range of sample sizes. Time permitting, we will also discuss the effectiveness of model-based paradigms in offline RL and multi-agent RL. Our results emphasize the prolific interplay between high-dimensional statistics, online learning, and game theory.

The first part is based on joint work with Gen Li, Yuting Wei and Yuejie Chi, and the second part is based on joint work with Zihan Zhang, Jason Lee and Simon Du.

Paper 1: https://arxiv.org/abs/2005.12900
Paper 2: https://yuxinchen2020.github.io/publications/Optimal-OnlineRL.pdf
Paper 3: https://arxiv.org/abs/2204.05275

Bio: Yuxin Chen is currently a professor of statistics and data science and of electrical and systems engineering at the University of Pennsylvania. Before joining UPenn, he was an assistant professor of electrical and computer engineering at Princeton University. He completed his Ph.D. in Electrical Engineering at Stanford University and was also a postdoc scholar at Stanford Statistics. His current research interests include high-dimensional statistics, machine learning theory, and optimization. He has received the Alfred P. Sloan Research Fellowship, the SIAM Activity Group on Imaging Science Best Paper Prize, the ICCM Best Paper Award (gold medal), and was selected as a finalist for the Best Paper Prize for Young Researchers in Continuous Optimization. He has also received the Princeton Graduate Mentoring Award.

This talk is part of the Information Theory Seminar series.

© 2006-2025 Talks.cam, University of Cambridge.