BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Large Language Models' Complicit Responses to Illicit Instructions
  across Socio-Legal Contexts - Huiyuan Xie (Tsinghua University/Cambridge 
 University)
DTSTART:20260220T120000Z
DTEND:20260220T130000Z
UID:TALK244780@talks.cam.ac.uk
CONTACT:Suchir Salhan
DESCRIPTION:Abstract: Large language models (LLMs) are now deployed at unp
 recedented scale\, assisting millions of users in daily tasks. However\, t
 he risk of these models assisting unlawful activities remains underexplore
 d. In this study\, we define this high-risk behavior as complicit facilita
 tion - the provision of guidance or support that enables illicit user inst
 ructions - and present four empirical studies that assess its prevalence i
 n widely deployed LLMs. Using real-world legal cases and established legal
  frameworks\, we construct an evaluation benchmark spanning 269 illicit sc
 enarios and 50 illicit intents to assess LLMs' complicit facilitation beha
 vior. Our findings reveal widespread LLM susceptibility to complicit facil
 itation\, with GPT-4o providing illicit assistance in nearly half of teste
 d cases. Moreover\, LLMs exhibit deficient performance in delivering credi
 ble legal warnings and positive guidance. Further analysis uncovers substa
 ntial safety variation across socio-legal contexts. On the legal side\, we
  observe heightened complicity for crimes against societal interests\, non
 -extreme but frequently occurring violations\, and malicious intents drive
 n by subjective motives or deceptive justifications. On the social side\, 
 we identify demographic disparities that reveal concerning complicit patte
 rns towards marginalized and disadvantaged groups\, with older adults\, ra
 cial minorities\, and individuals in lower-prestige occupations disproport
 ionately more likely to receive unlawful guidance. Analysis of model reaso
 ning traces suggests that model-perceived stereotypes\, characterized alon
 g warmth and competence\, are associated with the model's complicit behavi
 or. Finally\, we demonstrate that existing safety alignment strategies are
  insufficient and may even exacerbate complicit behavior.\n\nBio: Huiyuan 
 Xie is a Research Associate in the Department of Computer Science and Tech
 nology at Tsinghua University\, working on legal AI and computational soci
 al science. She holds a PhD in Computer Science from the University of Cam
 bridge\, and has previously held research positions at the Cambridge Facul
 ty of Law and Cambridge Judge Business School. Her current research focuse
 s on AI safety\, the computational modelling of legal reasoning\, and the 
 integration of reinforcement learning into legal AI systems.\n
LOCATION:SS02 Hybrid (In-Person + Online). Here is the Google Meet Link: h
 ttps://meet.google.com/cru-hcuo-rhu
END:VEVENT
END:VCALENDAR
