Automatic Lexical Acquisition from the CHILDES Database
- Speaker: Paula Buttery and Anna Korhonen (RCEAL)
- Date & Time: Tuesday 24 April 2007, 16:00 - 17:30
- Venue: GR-06/07, English Faculty Building
Abstract
Empirical data on the syntactic complexity of children’s speech is important for theories of language acquisition. Currently, much of this data is absent from the annotated versions of the CHILDES database. In this study, we show that a state-of-the-art subcategorization acquisition system (Preiss et al. 2007) can be used, without any domain-specific tuning, to extract large-scale subcategorization (frequency) information from both (i) the child speech and (ii) the child-directed speech in the CHILDES database. We demonstrate that the acquired information is sufficiently accurate to (a) confirm previously reported research findings and (b) yield completely new findings for theoretical language acquisition research. We also report qualitative results that can be used to further improve parsing and lexical acquisition technology for child language data.
Series
This talk is part of the RCEAL Tuesday Colloquia series.

