BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:How useful is quantilization for mitigating specification-gaming? 
 - Speaker to be confirmed
DTSTART:20190522T160000Z
DTEND:20190522T173000Z
UID:TALK125371@talks.cam.ac.uk
CONTACT:Adrià Garriga Alonso
DESCRIPTION:This week: “How useful is quantilization for mitigating spec
 ification-gaming?” by Ryan Carey. "Paper available here":https://drive.g
 oogle.com/uc?export=download&id=13qAfOm8McRvXS33MCNH0ia4ApMIClZP9\, publis
 hed in the "ICLR 2019 Safe Machine Learning workshop":https://sites.google
 .com/view/safeml-iclr2019/accepted-papers\n\nIf we have a specification th
 at does not perfectly reflect what we care about\, there are ways to maxim
 ize it which we want to avoid. To mitigate reward hacking (or specificatio
 n-gaming)\, we can perform "quantilization\, a method that interpolates be
 tween imitating demonstrations\, and optimizing the proxy objective. If th
 e demonstrations are of adequate quality\, and the proxy reward overestima
 tes performance\, then quantilization has better guaranteed performance t
 han other strategies. However\, if the proxy reward underestimates perfor
 mance\, then either imitation or optimization will offer the best guarante
 e."\n\nAs always\, there will be free pizza. The first half hour is for st
 ragglers to finish reading.\n\nInvite your friends to join the mailing lis
 t (https://lists.cam.ac.uk/mailman/listinfo/eng-safe-ai)\, the Facebook gr
 oup (https://www.facebook.com/groups/1070763633063871) or the talks.cam pa
 ge (https://talks.cam.ac.uk/show/index/80932). Details about the next meet
 ing\, the week’s topic and other events will be advertised in these plac
 es.
LOCATION:Engineering Department\, CBL Seminar room BE4-38
END:VEVENT
END:VCALENDAR