BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Computational Neuroscience Journal Club - Guillaume Hennequin and 
 Kris Jensen
DTSTART:20211008T130000Z
DTEND:20211008T143000Z
UID:TALK163717@talks.cam.ac.uk
CONTACT:Jake Stroud
DESCRIPTION:Please join us for our fortnightly journal club online via
  Zoom\, where two presenters jointly present a topic. The next topic is
  ‘Policy-gradient reinforcement learning’\, presented by Guillaume
  Hennequin and Kris Jensen.\n\nZoom information:
  https://us02web.zoom.us/j/
 84958321096?pwd=dFpsYnpJYWVNeHlJbEFKbW1OTzFiQT09 Meeting ID: 841 9788 6178
  Passcode: 659046\n\nSummary:\nHumans and animals continually learn from i
 nteracting with their environment in a paradigm commonly known as reinforc
 ement learning. In the neuroscience literature\, this is often phrased in
  the context of Q-learning or temporal-difference learning\, where
  decisions are made on the basis of the learned values of every state
  and action. In this journal club\, we focus on an alternative approach
  to reinforcement lea
 rning where a policy is instead learned by direct optimization of the
  expected future reward. We start with an introduction to such
  ‘policy gradie
 nt’ reinforcement learning by deriving the canonical ‘REINFORCE’ alg
 orithm and giving an overview of techniques used to reduce variance and st
 abilize learning. We then discuss how such policy gradient methods could p
 otentially be implemented in biological circuits using well-known synaptic
  plasticity rules. Finally\, we consider a case study of how policy
  gradient methods can be used to model biological agents and provide
  insights into the structure and function of neural circuits.\n\nRelevant
  reading:\nLevin
 e (2021). Berkeley CS 285 Lecture 5 notes (introduction to policy gradient
  methods and variance reduction). http://rail.eecs.berkeley.edu/deeprlcour
 se/static/slides/lec-5.pdf.\n\nFremaux et al. (2010). “Functional Requir
 ements for Reward-Modulated Spike-Timing-Dependent Plasticity.” https://
 www.jneurosci.org/content/30/40/13326.\n\nWang & Kurth-Nelson et al. (2018
 ). “Prefrontal cortex as a meta-reinforcement learning system.” https:
 //www.nature.com/articles/s41593-018-0147-8.
LOCATION:Online on Zoom
END:VEVENT
END:VCALENDAR