BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Learning Common Grammar from Multilingual Corpus / Online Multisca
 le Dynamic Topic Models - Dr. Tomoharu Iwata (NTT)
DTSTART:20100721T100000Z
DTEND:20100721T110000Z
UID:TALK25092@talks.cam.ac.uk
CONTACT:Zoubin Ghahramani
DESCRIPTION:I will give the following two talks.\n\n"Learning Common Gramm
 ar from Multilingual Corpus"\n\nWe propose a corpus-based probabilistic fr
 amework to extract hidden common syntax across languages from non-parallel
  multilingual corpora in an unsupervised fashion. For this purpose\, we as
 sume a generative model for multilingual corpora\, where each sentence is 
 generated from a language dependent probabilistic context-free grammar (PC
 FG)\, and these PCFGs are generated from a prior grammar that is common ac
 ross languages. We also develop a variational method for efficient inferen
 ce. Experiments on a non-parallel multilingual corpus of eleven languages 
 demonstrate the feasibility of the proposed method. \n\n"Online Multiscale
  Dynamic Topic Models"\n\nWe propose an online topic model for sequentiall
 y analyzing the time evolution of topics in document collections. Topics n
 aturally evolve with multiple timescales. For example\, some words may be 
 used consistently over one hundred years\, while other words emerge and di
 sappear over periods of a few days. Thus\, in the proposed model\, current
  topic-specific distributions over words are assumed to be generated based
  on the multiscale word distributions of the previous epoch. Considering b
 oth the long-timescale and the short-timescale dependencies yields a mor
 e robust model. We derive efficient online inference procedu
 res based on a stochastic EM algorithm\, in which the model is sequentiall
 y updated using newly obtained data.
LOCATION:Engineering Department\, CBL Room 438
END:VEVENT
END:VCALENDAR
