Unsupervised Multilingual Learning for Morphological Segmentation
- 👤 Speaker: Tom Lippincott
- 📅 Date & Time: Tuesday 02 February 2010, 12:30 - 13:30
- 📍 Venue: GS15, Computer Laboratory
Abstract
At this session of the NLIP Reading Group we’ll be discussing the following paper:
Benjamin Snyder and Regina Barzilay. 2008. Unsupervised Multilingual Learning for Morphological Segmentation. In Proceedings of ACL-08: HLT.
Abstract: For centuries, the deep connection between languages has brought about major discoveries about human communication. In this paper we investigate how this powerful source of information can be exploited for unsupervised language learning. In particular, we study the task of morphological segmentation of multiple languages. We present a nonparametric Bayesian model that jointly induces morpheme segmentations of each language under consideration and at the same time identifies cross-lingual morpheme patterns, or abstract morphemes. We apply our model to three Semitic languages: Arabic, Hebrew, Aramaic, as well as to English. Our results demonstrate that learning morphological models in tandem reduces error by up to 24% relative to monolingual models. Furthermore, we provide evidence that our joint model achieves better performance when applied to languages from the same family.
Series
This talk is part of the Natural Language Processing Reading Group series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- GS15, Computer Laboratory
- Guy Emerson's list
- Natural Language Processing Reading Group
Note: Ex-directory lists are not shown.