Mind the Data
- 👤 Speaker: Noah Smith, University of Washington
- 📅 Date & Time: Thursday 08 June 2023, 16:00 - 17:00
- 📍 Venue: https://cam-ac-uk.zoom.us/j/97599459216?pwd=QTRsOWZCOXRTREVnbTJBdXVpOXFvdz09
Abstract
Today’s mainstream NLP research focuses on general-purpose models that are scaled up to work with extremely large datasets. This direction has had many benefits, evidenced by performance on research benchmarks and by new use cases for AI in general, and language models specifically, imagined by an ever wider community of stakeholders. What I believe is coming next is a strong demand for customization. More people than ever will want to adapt language models to create new applications. To enable them, I believe we need new affordances for working with the most important ingredient for NLP systems: the data. In this talk, I’ll present recent work from my group showing benefits and risks of new methods for data selection, organization, and synthesis. I’ll advocate for a future in which artifacts like language models are developed to support adaptation to unexpected and diverging demands of a wide population of users, who in turn should be empowered to direct models to serve their own interests.
Series
This talk is part of the Language Technology Lab Seminars series.
Included in Lists
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- Guy Emerson's list
- https://cam-ac-uk.zoom.us/j/97599459216?pwd=QTRsOWZCOXRTREVnbTJBdXVpOXFvdz09
- Interested Talks
- Language Sciences for Graduate Students
- Language Technology Lab Seminars
- ndk22's list
- ob366-ai4er
- rp587
- Simon Baker's List
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.