BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Large language models for enabling constructive online conversatio
 ns - Kristina Gligorić\, Stanford University
DTSTART:20240425T150000Z
DTEND:20240425T160000Z
UID:TALK215485@talks.cam.ac.uk
CONTACT:Panagiotis Fytas
DESCRIPTION:NLP systems promise to disrupt society through applications
  in high-stakes social domains. However\, current evaluation and
  development focus on tasks that are not grounded in specific societal
  implications\, which can lead to societal harm. There is a need to
  evaluate and mitigate these societal harms and\, in doing so\, to
  bridge the gap between the realities of application and how models
  are currently developed.\nIn this talk\, I will present recent work
  addressing these issues in the domain of online content moderation.
  In the first part\, I will discuss moderation aimed at enabling
  constructive conversations about race. Content moderation practices
  on social media risk silencing the voices of historically
  marginalized groups. We find that both the most recent models and
  humans disproportionately flag posts in which users share personal
  experiences of racism. Not only does this censorship hinder the
  potential of social media to give voice to marginalized
  communities\, but we also find that witnessing such censorship
  exacerbates feelings of isolation. We offer a path to reducing
  censorship through a psychologically informed reframing of
  moderation guidelines. These findings reveal how automated content
  moderation practices can help or hinder the effort to foster
  constructive conversations in an increasingly diverse nation where
  online interactions are commonplace.\nIn the second part\, I will
  discuss how the identified biases in models can be traced to the
  use-mention distinction\, the difference between the use of words to
  convey a speaker's intent and the mention of words to quote what
  someone said or to point out properties of a word. Computationally
  modeling the use-mention distinction is crucial for enabling
  counterspeech to hate and misinformation\, because counterspeech
  that refutes problematic content mentions harmful language but is
  not itself harmful. We show that even recent language models fail
  to distinguish use from mention and that this failure propagates to
  downstream tasks. We introduce prompting mitigations that teach the
  use-mention distinction and show that they reduce these
  errors.\nFinally\, I will discuss the big picture and other recent
  efforts to address these issues in domains beyond content
  moderation\, including education\, emotional support\, and public
  discourse about AI. I will reflect on how\, by doing so\, we can
  minimize harms while developing and applying NLP systems for social
  good.\n
LOCATION:https://cam-ac-uk.zoom.us/j/97599459216?pwd=QTRsOWZCOXRTREVnbTJBd
 XVpOXFvdz09
END:VEVENT
END:VCALENDAR
