BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Learning Syntax with Deep Neural Networks  - Shalom Lappin (Univer
 sity of Gothenburg\, King's College London and Queen Mary University of Lo
 ndon)
DTSTART:20170525T153000Z
DTEND:20170525T173000Z
UID:TALK71859@talks.cam.ac.uk
CONTACT:Giulia Bovolenta
DESCRIPTION:Joint work with Jean-Philippe Bernardy\, University of Gothenb
 urg\n\nWe consider the extent to which different deep neural network (DNN)
  configurations \ncan learn syntactic relations\, by taking up Linzen et a
 l.'s (2016) work on subject-verb \nagreement with LSTM RNNs. We test their
 methods on a much larger corpus than \nthey used (a ~24 million example
  part of the WaCky corpus\, instead of their ~1.35 million example corpus\
 , both drawn from Wikipedia).\nWe experiment with several \ndifferent DNN
  architectures (LSTM RNNs\, GRUs\, and CNNs)\, and alternative parameter \
 nsettings for these systems (vocabulary size\, training-to-test ratio\, nu
 mber of layers\, memory \nsize\, and dropout rate). We also try out our o
 wn unsupervised DNN language model. Our \nresults are broadly compatible w
 ith those that Linzen et al. report. However\, we discovered \nsome intere
 sting\, and in some cases\, surprising features of DNNs and language model
 s in \ntheir performance on the agreement learning task. In particular\, w
 e found that DNNs require \nlarge vocabularies to form substantive lexical
  embeddings in order to learn structural patterns. \nThis finding has sign
 ificant consequences for our understanding of the way in which DNNs \nrepr
 esent syntactic information. We also achieved significantly better accurac
 y with our \nlanguage model for unsupervised prediction of agreement than 
 Linzen et al. report in their \nLM experiments.\n
LOCATION:GR-06/7\, Faculty of English\, 9 West Rd (Sidgwick Site)
END:VEVENT
END:VCALENDAR
