BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Use of Linguistic Information and Reordering Strategies for Ngram-
 based Statistical Machine Translation - Adria de Gispert\, TALP Researc
 h Centre – Univ. Politecnica de Catalunya (UPC)\, Barcelona\, Spain
DTSTART:20061031T130000Z
DTEND:20061031T140000Z
UID:TALK5813@talks.cam.ac.uk
CONTACT:Dr Marcus Tomalin
DESCRIPTION:This seminar gives an overview of statistical machine transl
 ation work at UPC in recent years. Firstly\, t
 he Ngram-based SMT system will be described\, detailing bilingual unit def
 inition and basic feature functions for a monotone language pair.\nSecondl
 y\, the introduction of linguistic information at various stages will be d
 iscussed\, including word alignment (investigating correlation between Ali
 gnment Error Rate and translation scores)\, bilingual unit segmentation an
 d direct translation modelling. Results on English-to-Spanish verb form cl
 assification will be reviewed\, as well as the impact of morphology reduct
 ion on bilingual N-gram formulation. For language pairs exhibiting less mo
 notone word order\, the reordering strategies implemented will be presente
 d. In particular\, reordered search involving tuple unfolding and extende
 d monotone search by linguistically driven reordering rules will be compa
 red for Arabic-\, Chinese- and Spanish-to-English tasks. Finally\, the se
 minar will conclude by outlining general future research directions for i
 mproving the performance of current state-of-the-art SMT systems.
LOCATION:LR5\, Engineering Department\, Baker Building
END:VEVENT
END:VCALENDAR
