
Can Language Models Learn Truthfulness?


He He, New York University
Thursday 18 January 2024, 15:00-16:00
https://cam-ac-uk.zoom.us/j/97599459216?pwd=QTRsOWZCOXRTREVnbTJBdXVpOXFvdz09

If you have a question about this talk, please contact Panagiotis Fytas.

Today’s large language models (LLMs) are trained on vast amounts of text from the internet, which contains both factual and misleading information about the world. Can language models discern truth from falsehood in this contradictory data? This talk introduces a hypothesis for how LLMs can model truthfulness. Inspired by the agent-model view of language models, we hypothesize that they can cluster truthful text by modeling a truthful persona: a group of agents that are likely to produce truthful text and share similar features. I will discuss both results on real data and controlled experiments on synthetic data that support the hypothesis. Overall, our findings suggest that models can exploit hierarchical structures in the data to learn abstract concepts like truthfulness.

This talk is part of the Language Technology Lab Seminars series.

