Nobody Writes Letters Anymore: Helping people make sense of historically significant email collections
- 👤 Speaker: Douglas W. Oard, University of Maryland, USA
- 📅 Date & Time: Friday 04 July 2008, 12:00 - 13:00
- 📍 Venue: Small Lecture Theatre, Computer Laboratory
Abstract
The archivist’s dilemma is that in a world with vastly more information being created, less of what we should keep may reach the archive in forms that we know how to manage. Much of present archival practice rests on four key facts: important records have generally been written on paper, paper records are (reasonably) persistent, paper records require some level of manual description, and the costs of description and preservation necessitate appraisal and selection. We are, however, moving toward a world in which records that are never committed to paper may prove to be ephemeral, digital objects can be (at least to some extent) self-describing, and the economics of appraisal and retention might therefore reverse. Many projects are now working on reliably getting important digital records into the future, so in this talk I’ll focus on what I see as the natural next step: helping to at least partially automate description. To illustrate how this might be done, I’ll describe joint work with Tamer Elsayed to automatically resolve the identity of people who are mentioned ambiguously (e.g., just by first name) in a collection of email from a failed corporation (Enron). Our results indicate that, at least for people who are well represented in the collection, we can use a generative model to guess the right identity more than 80% of the time. I’ll conclude the talk with a few remarks on our next directions for techniques, evaluation, and additional types of collections to which similar ideas might be applied.
Series
This talk is part of the NLIP Seminar Series.