New Relative Value Iteration and Q-Learning Algorithms for Ergodic Risk Sensitive Control of Markov Chains
- Speaker: Guodong Pang (Rice University)
- Date & Time: Thursday 13 November 2025, 10:50 - 11:30
- Venue: Seminar Room 1, Newton Institute
Abstract
In this talk, we will present new Jacobi-like relative value iteration (RVI) algorithms for the ergodic risk-sensitive control problem of discrete-time Markov chains, together with the associated Q-learning algorithms. In the case of a finite state space, we prove that the iterates of the new RVI algorithms converge geometrically, and in the case of a countable state space, we prove convergence for an appropriately truncated problem. We employ the entropy variational formula to tackle the multiplicative nature of the risk-sensitive Bellman operator, at the cost of an additional optimization problem over a corresponding set of probability vectors. We then discuss the entropy-based risk-sensitive Q-learning algorithms corresponding to the existing and new Jacobi-like RVI algorithms. These Q-learning algorithms have two coupled components: the usual Q-function iterates and the new probability iterates arising from the entropy variational formula. We prove the convergence of the coupled iterates by analyzing the multi-scale stochastic approximations involved.
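To make the objects in the abstract concrete, here is a minimal numerical sketch, not the talk's new Jacobi-like algorithms: it runs the classical multiplicative RVI for a small finite risk-sensitive MDP and numerically checks the entropy (Donsker-Varadhan) variational formula that underlies the entropy-based approach. All problem data (state/action counts, costs, transitions, the risk parameter `beta`) are illustrative assumptions, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
nS, nA = 4, 2     # illustrative small MDP (assumption)
beta = 0.5        # risk-sensitivity parameter (assumption)

# Random costs c(x,u) in [0,1] and transition kernels P[u][x, y]
c = rng.uniform(0.0, 1.0, size=(nS, nA))
P = rng.uniform(size=(nA, nS, nS))
P /= P.sum(axis=2, keepdims=True)

def bellman(V):
    # Multiplicative risk-sensitive Bellman operator:
    # (T V)(x) = min_u exp(beta * c(x,u)) * sum_y P(y | x,u) V(y)
    Q = np.exp(beta * c) * np.stack([P[u] @ V for u in range(nA)], axis=1)
    return Q.min(axis=1)

# Classical relative value iteration: normalize at a reference state so the
# iterates stay bounded and converge to the principal eigenvector of T.
V = np.ones(nS)
ref = 0
for _ in range(500):
    TV = bellman(V)
    V = TV / TV[ref]

# Optimal risk-sensitive ergodic cost: T V = exp(beta * rho) * V at the fixed point
rho = np.log(bellman(V)[ref]) / beta

# Entropy variational formula (used to linearize the multiplicative operator):
#   log sum_y p_y exp(f_y) = max_q [ sum_y q_y f_y - KL(q || p) ],
# with the maximum attained at q_y proportional to p_y exp(f_y).
f = rng.normal(size=nS)
p = P[0][0]
q = p * np.exp(f)
q /= q.sum()
lhs = np.log(p @ np.exp(f))
rhs = q @ f - q @ np.log(q / p)
```

At the fixed point, `V` solves the multiplicative Poisson equation up to normalization, and `rho` approximates the optimal ergodic risk-sensitive cost; the `lhs`/`rhs` pair verifies the variational identity at its maximizer. The talk's Jacobi-like RVI and the coupled Q-learning iterates replace this synchronous sweep with componentwise and stochastic-approximation updates, respectively.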
Series: This talk is part of the Isaac Newton Institute Seminar Series.
Included in Lists
- All CMS events
- bld31
- dh539
- Featured lists
- INI info aggregator
- Isaac Newton Institute Seminar Series
- School of Physical Sciences
- Seminar Room 1, Newton Institute
Note: Ex-directory lists are not shown.