Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
- 👤 Speaker: Diarmuid Ó Séaghdha (Computer Laboratory)
- 📅 Date & Time: Wednesday 05 November 2008, 13:00 - 14:00
- 📍 Venue: GS15, Computer Laboratory
Abstract
I’ll be presenting and discussing the following paper from EMNLP 2008:
Rion Snow, Brendan O’Connor, Daniel Jurafsky and Andrew Y. Ng. Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks. Proceedings of EMNLP 2008.
Abstract:
Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon’s Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web. We investigate five tasks: affect recognition, word similarity, recognizing textual entailment, event temporal ordering, and word sense disambiguation. For all five, we show high agreement between Mechanical Turk non-expert annotations and existing gold standard labels provided by expert labelers. For the task of affect recognition, we also show that using non-expert labels for training machine learning algorithms can be as effective as using gold standard annotations from experts. We propose a technique for bias correction that significantly improves annotation quality on two tasks. We conclude that many large labeling tasks can be effectively designed and carried out in this method at a fraction of the usual expense.
Series
This talk is part of the Natural Language Processing Reading Group series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- GS15, Computer Laboratory
- Guy Emerson's list
- Natural Language Processing Reading Group