BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:AI Alignment & RL in Turbulent Environments - Leo Thom (Uni of
  Cambridge)\, Yixuan Zhu (Imperial Col.)
DTSTART:20260318T110000Z
DTEND:20260318T120000Z
UID:TALK245869@talks.cam.ac.uk
CONTACT:Liz Tan
DESCRIPTION:We have two talks for the final journal club of Lent!\n\n*1.
  Stress Testing Deliberative Alignment for Anti-Scheming Training* - _Leo
  Thom_\n\nCan AI models secretly pursue their own goals while appearing
  aligned? This paper by OpenAI and Apollo Research shows that they can\,
  demonstrating sandbagging\, self-grading manipulation\, and strategic
  deception across all major frontier models. We examine their proposed
  fix\, its ~30× reduction in scheming\, and a critical caveat about
  situational awareness that complicates the results. No AI safety
  background assumed.\n\n\n*2. Navigation with Reinforcement Learning in
  Turbulent Environments* - _Yixuan Zhu (Imperial)_\n\nAutonomous
  navigation in turbulent atmospheres presents
  a unique challenge\, characterized by uncertain causal relationships and
  incomplete environmental information. In this talk\, we will explore
  thermal soaring\, the process by which birds and gliders harvest energy
  from ascending air currents to remain airborne without propulsion.\nWe
  will examine two key studies by Reddy et al. that use Reinforcement
  Learning (RL) to address this problem. First\, we will discuss how
  gliders can learn effective soaring strategies in turbulent flow
  simulations. We will then look at the transition to real-world
  applications\, where RL algorithms trained on field data enabled model
  gliders to outperform untrained counterparts by identifying and tracking
  thermals.\n
LOCATION:MR10\, Centre for Mathematical Sciences
END:VEVENT
END:VCALENDAR