BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Semi-supervised Training of a Statistical Parser from Unlabeled Pa
 rtially-bracketed Data - John Carroll - Department of Informatics\, Univer
 sity of Sussex
DTSTART:20070615T140000Z
DTEND:20070615T150000Z
UID:TALK7469@talks.cam.ac.uk
CONTACT:5968
DESCRIPTION:We compare the accuracy of a statistical parse ranking model t
 rained from a fully-annotated portion of the Susanne treebank with one tr
 ained from unlabeled partially-bracketed sentences derived from this tree
 bank and from the Penn Treebank. We demonstrate that confidence-based sem
 i-supervised techniques similar to self-training outperform expectation m
 aximization when both are constrained by partial bracketing. Both method
 s based on partially-bracketed training data outperform the fully superv
 ised technique\, and both can\, in principle\, be applied to any statisti
 cal parser whose output is consistent with such partial bracketing. We al
 so explore tuning the model to a different domain and the effect of in-do
 main data in the semi-supervised training processes.
LOCATION:SW01 Computer Laboratory
END:VEVENT
END:VCALENDAR
