AI Alignment & RL in Turbulent Environments
- 🎤 Speaker: Leo Thom (University of Cambridge), Yixuan Zhu (Imperial College London)
- 📅 Date & Time: Wednesday 18 March 2026, 11:00 - 12:00
- 📍 Venue: MR10, Centre for Mathematical Sciences
Abstract
We have two talks for the final journal club of Lent!
1. Stress Testing Deliberative Alignment for Anti-Scheming Training – Leo Thom
Can AI models secretly pursue their own goals while appearing aligned? This paper by OpenAI and Apollo Research shows they can, demonstrating sandbagging, self-grading manipulation, and strategic deception across all major frontier models. We examine their proposed fix, its ~30× reduction in scheming, and a critical caveat about situational awareness that complicates the results. No AI safety background assumed.
2. Navigation with Reinforcement Learning in Turbulent Environments – Yixuan Zhu (Imperial)
Autonomous navigation in turbulent atmospheres presents a unique challenge, characterized by uncertain causal relationships and incomplete environmental information. In this talk, we will explore thermal soaring, the process by which birds and gliders harvest energy from ascending air currents to remain airborne without propulsion. We will examine two key studies by Reddy et al. that utilize Reinforcement Learning (RL) to address this problem. First, we will discuss how gliders can learn effective soaring strategies in turbulent flow simulations. We will then look at the transition to real-world applications, where RL algorithms trained on field data enabled model gliders to outperform non-trained counterparts by identifying and tracking thermals.
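To give a flavour of the approach ahead of the talk, here is a minimal tabular Q-learning sketch in the spirit of the soaring problem. The state and action encodings, reward, and environment dynamics are all hypothetical simplifications for illustration, not the actual setup used by Reddy et al.

```python
import random

# Toy tabular Q-learning sketch for thermal soaring (hypothetical
# encoding, not the setup from the papers discussed in the talk).
# State: sign of the recent change in vertical air velocity (-1, 0, +1).
# Action: bank left, fly straight, or bank right.
STATES = (-1, 0, 1)
ACTIONS = ("left", "straight", "right")

def q_learning(episodes=500, steps=50, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(s, x)])
            # Toy reward: circling while the climb rate is improving
            # (s == +1) mimics "stay inside the thermal core".
            reward = 1.0 if (s == 1 and a != "straight") else 0.0
            # Turbulence: the next observation is noisy.
            s_next = rng.choice(STATES)
            # Standard Q-learning update.
            best_next = max(Q[(s_next, x)] for x in ACTIONS)
            Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
            s = s_next
    return Q
```

After training, the greedy policy in the "climb rate increasing" state prefers a banking action over flying straight, which is the qualitative behaviour the field experiments reward. The real work replaces this toy table with states built from local mechanical cues (e.g. vertical acceleration and roll-wise torque) measured in turbulent flow.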
Series: This talk is part of the DAMTP ML for Science Reading Group series.

