Formal symbolic models for LLMs: pretraining, evaluation and post-training
- 👤 Speaker: Prof. Tal Linzen (NYU & Google)
- 📅 Date & Time: Thursday 12 March 2026, 15:00 - 16:00
- 📍 Venue: https://cam-ac-uk.zoom.us/j/86890624365?pwd=oYGWpY7d5r3JOaUCaJXTD0sRECFxab.1
Abstract
Formal symbolic models (ideal versions of the computational problems involved in learning about and interacting with the world) are central in linguistics and cognitive science. These models make it possible to generate unlimited amounts of synthetic data of variable complexity, and to define verifiably correct outcomes for each instance of the problem. I will discuss three studies that leverage these properties of formal models. First, I will show that by pretraining transformer LLMs on formal languages before training them on natural languages, we can make training both more compute-efficient and more data-efficient overall. Second, I will introduce context-free language recognition as an evaluation task for LLMs. I will show that the complexity of the grammar reliably predicts the model’s accuracy on this task, and that even the strongest reasoning models available struggle to perform this task as the complexity of the language increases. Finally, I will demonstrate how symbolic Bayesian models can be used to evaluate and improve the ability of LLMs to update their probabilistic beliefs when interacting with users.
Bio: Tal Linzen is an Associate Professor of Linguistics and Data Science at New York University and a Staff Research Scientist at Google. He studies the connections between machine learning and human language comprehension and acquisition, as well as cognitively motivated approaches for language model evaluation, post-training, and interpretability.
Series: This talk is part of the Language Technology Lab Seminars series.
Included in Lists
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- Guy Emerson's list
- https://cam-ac-uk.zoom.us/j/86890624365?pwd=oYGWpY7d5r3JOaUCaJXTD0sRECFxab.1
- Interested Talks
- Language Sciences for Graduate Students
- Language Technology Lab Seminars
- ndk22's list
- ob366-ai4er
- rp587
- Simon Baker's List
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.