BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinfo
 rcement Learning - Adrià Garriga Alonso (University of Cambridge)
DTSTART:20181114T170000Z
DTEND:20181114T190000Z
UID:TALK114901@talks.cam.ac.uk
CONTACT:Adrià Garriga Alonso
DESCRIPTION:One potential approach to the value alignment problem is to bu
 ild for corrigibility: to try to construct a system whose operation we
  can modify at any point. This should be the case even if its objectiv
 es are incorrect or would be harmed by the modification.\n\nTo this end\, 
 this week we read "“Dynamic Safe Interruptibility for Decentralized Mult
 i-Agent Reinforcement Learning”":https://arxiv.org/abs/1704.02882\, by E
 l Mahdi El Mhamdi\, Guerraoui\, Hendrikx and Maurer.\nThey extend the noti
 on of interruptibility to multi-agent algorithms: they construct a way of 
 conditioning a multi-agent reinforcement learner such that the agents
  won't learn to "plan around" interruptions of their operations. In
  essence\, the
 y will act as if they believed they would never be interrupted. This is ba
 sed on the initial\, single-agent\, case by Armstrong and Orseau [2]\, whi
 ch we won't read this week.\n\n\nThere will be free pizza. At 17:00\, we w
 ill start reading the paper\, mostly individually. At 17:30\, the discussi
 on leader will start going through the paper\, making sure everyone unders
 tands\, and encouraging discussion about its contents and implications.\n\
 nEven if you think you cannot contribute to the conversation\, you should 
 give it a try. Last year we had several people from non-computer-y backgro
 unds\, and others who hadn't thought about alignment before\, who ended u
 p being essential. If you have already read the paper in your own time you
  can come in time for the discussion.\n\nA basic understanding of machine 
 learning is helpful\, but detailed knowledge of the latest techniques is n
 ot required. Each session will have a brief recap of immediately
  necessary knowledge. The goal of this series is to get people to know
  more about the existing work in AI research\, and eventually contribute
  to the field.\n\n
 Invite your friends to join the mailing list (https://lists.cam.ac.uk/mail
 man/listinfo/eng-safe-ai)\, the Facebook group (https://www.facebook.com/g
 roups/1070763633063871) or the talks.cam page (https://talks.cam.ac.uk/sho
 w/index/80932). Details about the next meeting\, the week's topic and othe
 r events will be advertised in these places.\n\n\n[1] (to read) El Mahdi E
 l Mhamdi\, Guerraoui\, Hendrikx and Maurer. “Dynamic Safe Interruptibili
 ty for Decentralized Multi-Agent Reinforcement Learning”. https://arxiv.
 org/abs/1704.02882\n\n[2] "Safely Interruptible Agents"\, Stuart Armstrong
  and Laurent Orseau. http://intelligence.org/files/Interruptibility.pdf\n
LOCATION:Cambridge University Engineering Department\, CBL Seminar room BE
 4-38
END:VEVENT
END:VCALENDAR
