How useful is quantilization for mitigating specification-gaming?
- 👤 Speaker: Speaker to be confirmed
- 📅 Date & Time: Wednesday 22 May 2019, 17:00 - 18:30
- 📍 Venue: Engineering Department, CBL Seminar room BE4-38
Abstract
This week: “How useful is quantilization for mitigating specification-gaming?” by Ryan Carey. The paper was published in the ICLR 2019 Safe Machine Learning workshop.
If we have a specification that does not perfectly reflect what we care about, there are ways to maximize it which we want to avoid. To mitigate reward hacking (or specification-gaming), we can perform “quantilization, a method that interpolates between imitating demonstrations, and optimizing the proxy objective. If the demonstrations are of adequate quality, and the proxy reward overestimates performance, then quantilization has better guaranteed performance than other strategies. However, if the proxy reward underestimates performance, then either imitation or optimization will offer the best guarantee.”
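For readers new to the idea, here is a minimal sketch of a quantilizer over a finite set of demonstrated actions. It is not taken from the paper: the function name `quantilize`, its parameters, and the choice of a uniform base distribution over the demonstrations are all assumptions made for illustration. Setting `q = 1` amounts to pure imitation (sample any demonstration), while a very small `q` approaches optimizing the proxy among the demonstrated actions.

```python
import math
import random


def quantilize(demonstrations, proxy_reward, q, rng=random):
    """Sample one action from the top q fraction of demonstrations.

    Hypothetical helper, assuming the base distribution is uniform over
    the given demonstrations and proxy_reward maps an action to a number.
    """
    if not demonstrations:
        raise ValueError("need at least one demonstration")
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    # Rank demonstrated actions from highest to lowest proxy reward.
    ranked = sorted(demonstrations, key=proxy_reward, reverse=True)
    # Keep the top q fraction (always at least one action).
    keep = max(1, math.ceil(q * len(ranked)))
    # Sample uniformly within that top slice.
    return rng.choice(ranked[:keep])


if __name__ == "__main__":
    actions = list(range(10))   # toy demonstrated actions
    reward = lambda a: a        # toy proxy reward
    print(quantilize(actions, reward, q=1.0))  # imitation: any demonstration
    print(quantilize(actions, reward, q=0.5))  # one of the top half
    print(quantilize(actions, reward, q=0.1))  # the single best demonstration
```

Intermediate values of `q` trade off the two extremes: the sampled action still looks like something a demonstrator would do, but is biased towards higher proxy reward.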
As always, there will be free pizza. The first half hour is for stragglers to finish reading.
Invite your friends to join the mailing list (https://lists.cam.ac.uk/mailman/listinfo/eng-safe-ai), the Facebook group (https://www.facebook.com/groups/1070763633063871) or the talks.cam page (https://talks.cam.ac.uk/show/index/80932). Details about the next meeting, the week’s topic and other events will be advertised in these places.
Series
This talk is part of the Engineering Safe AI series.
Included in Lists
- Cambridge talks
- Chris Davis' list
- Engineering Department, CBL Seminar room BE4-38
- Engineering Safe AI
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.