NLIP reading group: Connecting the Dots Between News Articles
- 👤 Speaker: Diarmuid Ó Séaghdha (University of Cambridge)
- 📅 Date & Time: Thursday 12 May 2011, 12:00 - 13:00
- 📍 Venue: GS15, Computer Laboratory
Abstract
Diarmuid will be kicking off this term’s reading group series with the following paper:
@conference{shahaf2010connecting, title={{Connecting the dots between news articles}}, author={Shahaf, D. and Guestrin, C.}, booktitle={Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining}, pages={623--632}, year={2010}, organization={ACM} }
The process of extracting useful knowledge from large datasets has become one of the most pressing problems in today’s society. The problem spans entire sectors, from scientists to intelligence analysts and web users, all of whom are constantly struggling to keep up with the larger and larger amounts of content published every day. With this much data, it is often easy to miss the big picture.
In this paper, we investigate methods for automatically connecting the dots: providing a structured, easy way to navigate within a new topic and discover hidden connections. We focus on the news domain: given two news articles, our system automatically finds a coherent chain linking them together. For example, it can recover the chain of events starting with the decline of home prices (January 2007), and ending with the ongoing health-care debate.
We formalize the characteristics of a good chain and provide an efficient algorithm (with theoretical guarantees) to connect two fixed endpoints. We incorporate user feedback into our framework, allowing the stories to be refined and personalized. Finally, we evaluate our algorithm over real news data. Our user studies demonstrate the algorithm’s effectiveness in helping users understand the news.
Series
This talk is part of the Natural Language Processing Reading Group series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- GS15, Computer Laboratory
- Guy Emerson's list
- Natural Language Processing Reading Group