Five Sources of Biases and Ethical Issues in NLP, and What to Do about Them
- Speaker: Dirk Hovy (Bocconi University)
- Date & Time: Friday 23 October 2020, 12:00 - 13:00
- Venue: Virtual (Zoom)
Abstract
Never before has it been so easy to build a powerful NLP system, and never before has such a system had so much potential impact. However, these systems are now increasingly used in applications they were not intended for, by people who treat them as interchangeable black boxes. The results can be simple performance drops, but also systematic biases against various user groups.
In this talk, I will discuss several types of bias that affect NLP models (based on Shah et al., 2020, and Hovy & Spruit, 2016), their sources, and potential countermeasures:
- biases stemming from the data: selection bias (when our texts do not adequately reflect the population we want to study), label bias (when the labels we use are skewed), and semantic bias (the latent stereotypes encoded in embeddings);
- biases deriving from the models themselves, i.e., their tendency to amplify any imbalances present in the data;
- design bias, i.e., the biases arising from our decisions as practitioners about which topics to explore, which datasets to use, and what to do with them.
As a consequence, we NLP practitioners suddenly have a new role, in addition to researcher and developer: considering the ethical implications of our systems and educating the public about the possibilities and limitations of our work. The time of academic innocence is over, and we need to address this newfound responsibility as a community.
For each bias, I will provide real examples, discuss possible ramifications for a wide range of applications, and cover various ways to address and counteract these biases, ranging from simple labeling considerations to new types of models. I conclude with some provocations for future directions.
References:
- Deven Shah, H. Andrew Schwartz, & Dirk Hovy. 2020. Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview. In Proceedings of ACL. [https://www.aclweb.org/anthology/2020.acl-main.468/]
- Dirk Hovy & Shannon L. Spruit. 2016. The Social Impact of Natural Language Processing. In Proceedings of ACL. [https://www.aclweb.org/anthology/P16-2096.pdf]
Bio: Dirk Hovy is an associate professor of computer science at Bocconi University in Milan, Italy. Before that, he was faculty and a postdoc in Copenhagen, received a PhD from USC, and earned a master's degree in linguistics in Germany. He is interested in the interaction between language, society, and machine learning: what language can tell us about society, and what computers can tell us about language. He has authored over 50 articles on these topics, including three best-paper award winners. He has organized one conference and several workshops (on abusive language, ethics in NLP, and computational social science). Outside of work, Dirk enjoys cooking, running, and leather-crafting. For updated information, see http://www.dirkhovy.com
https://cl-cam-ac-uk.zoom.us/j/92174303432?pwd=S2NLWE42VmhRdGE0dlRuMXFFb3FOZz09
Meeting ID: 921 7430 3432 Passcode: 137181
This talk is part of the NLIP Seminar Series.