
Formal symbolic models for LLMs: pretraining, evaluation and post-training


If you have a question about this talk, please contact Lucas Resck.

Abstract: Formal symbolic models—ideal versions of the computational problems involved in learning about and interacting with the world—are central in linguistics and cognitive science. These models make it possible to generate unlimited amounts of synthetic data of variable complexity, as well as to define verifiably correct outcomes for each instance of the problem. I will discuss three studies that leverage these properties of formal models. First, I will show that by pretraining transformer LLMs on formal languages before training them on natural languages, we can make training both more compute-efficient and more data-efficient overall. Second, I will introduce context-free language recognition as an evaluation task for LLMs. I will show that the complexity of the grammar reliably predicts a model’s accuracy on this task, and that even the strongest available reasoning models struggle with it as the complexity of the language increases. Finally, I will demonstrate how symbolic Bayesian models can be used to evaluate and improve LLMs’ ability to update their probabilistic beliefs when interacting with users.
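To make the evaluation setup concrete, here is a minimal illustrative sketch (not taken from the talk): a formal model gives an exact recognizer for a context-free language, so every test string has a verifiably correct yes/no label. The Dyck language of balanced brackets is used here purely as an example; an LLM asked to judge membership can then be scored against this recognizer as complexity (e.g. nesting depth) grows.

```python
def is_dyck(s: str) -> bool:
    """Return True iff s is a balanced string over the brackets () and []."""
    pairs = {"(": ")", "[": "]"}
    stack = []
    for ch in s:
        if ch in pairs:
            # Opening bracket: remember which closer we now expect.
            stack.append(pairs[ch])
        elif not stack or ch != stack.pop():
            # Closer with no matching opener, or the wrong closer.
            return False
    # Every opener must have been closed.
    return not stack

# Ground-truth labels for a small evaluation set.
examples = ["([])", "([)]", "((", ""]
labels = [is_dyck(s) for s in examples]
```

Because the recognizer is exact, model accuracy on such a set is unambiguous, and harder instances can be generated simply by sampling longer or more deeply nested strings.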

Bio: Tal Linzen is an Associate Professor of Linguistics and Data Science at New York University and a Staff Research Scientist at Google. He studies the connections between machine learning and human language comprehension and acquisition, as well as cognitively motivated approaches for language model evaluation, post-training, and interpretability.

This talk is part of the Language Technology Lab Seminars series.

