BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Language Modelling with Phonemes - Zebulon Youra Goriely (Universi
 ty of Cambridge)
DTSTART:20241025T110000Z
DTEND:20241025T120000Z
UID:TALK222613@talks.cam.ac.uk
CONTACT:Suchir Salhan
DESCRIPTION:The statistical properties of language and how they may be use
 d in language processing and language acquisition have been studied for ma
 ny decades. Recently\, large language models have demonstrated striking la
 nguage-learning capabilities\, providing evidence for the “richness” o
 f the linguistic stimulus\, but are often trained on data that seems cogni
 tively implausible both in terms of quantity (thousands of human lifetimes
 ) and quality (written text\, internet sources). For these models to help 
 us study language\, we must think far more carefully about the plausibilit
 y of the input – using phonemes instead of letters\, using spoken source
 s\, and reducing the quantity. We must then determine whether the architec
 tures we use are suitable at this scale and input representation. These mo
 dels can then give us valuable analytical insights about the statistical p
 roperties of language and the learnability of language\, as well as giving
  us practical benefits for tasks associated with language modelling and la
 nguage understanding.\n\n*Speaker Biography*\n\nZebulon Goriely is a fourt
 h-year PhD student working on Transformer Language Models and Child Langua
 ge Acquisition\, supervised by Professor Paula Buttery.
LOCATION:Zoom link: https://cam-ac-uk.zoom.us/j/4751389294?pwd=Z2ZOSDk0eG1
 wZldVWG1GVVhrTzFIZz09
END:VEVENT
END:VCALENDAR
