Topic Modeling: Beyond Bag-of-Words
- Speaker: Hanna Wallach, Inference Group
- Date & Time: Wednesday 22 February 2006, 15:00 - 16:00
- Venue: Ryle Seminar Room, Cavendish Laboratory
Abstract
Some models of textual corpora generate text using n-gram statistics, while others use latent topic variables inferred under the "bag-of-words" assumption, in which word order is ignored. Previously, these two approaches have not been combined. In this talk, I present a hierarchical generative probabilistic model that incorporates both n-gram statistics and latent topic variables by extending a unigram topic model to include properties of a hierarchical Dirichlet bigram language model.
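As a rough illustration of what combining the two ideas can look like, the sketch below generates a toy document in which each word depends on both a latent topic and the previous word. It is not taken from the talk: the function names, the symmetric Dirichlet priors, and the values of `alpha` and `beta` are assumptions made for illustration, and the flat priors stand in for the hierarchical priors the abstract mentions.

```python
import random


def sample_dirichlet(concentration, rng):
    """Draw from a Dirichlet by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(length, vocab, num_topics, alpha=1.0, beta=0.5, seed=0):
    """Generate (words, topics) from a toy bigram topic model.

    Assumption: symmetric Dirichlet priors with illustrative values alpha and
    beta, used here in place of hierarchical priors.
    """
    rng = random.Random(seed)
    theta = sample_dirichlet([alpha] * num_topics, rng)  # document-topic mix
    phi = {}  # word distributions, one per (topic, previous word) context
    words, topics = [], []
    prev = None  # None marks the start-of-document context
    for _ in range(length):
        # Pick a topic for this position from the document's topic mix.
        z = rng.choices(range(num_topics), weights=theta)[0]
        context = (z, prev)
        # Lazily sample the word distribution for this (topic, word) context.
        if context not in phi:
            phi[context] = sample_dirichlet([beta] * len(vocab), rng)
        # The next word depends on both the topic and the previous word.
        w = rng.choices(vocab, weights=phi[context])[0]
        words.append(w)
        topics.append(z)
        prev = w
    return words, topics


vocab = ["topic", "model", "word", "order", "bigram"]
words, topics = generate_document(8, vocab, num_topics=2)
print(" ".join(words))
```

Dropping `prev` from the context key would reduce this to a unigram topic model, where word order is ignored; dropping `z` would reduce it to a plain bigram language model.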
Series
This talk is part of the Inference Group series.
Included in Lists
- All Cavendish Laboratory Seminars
- All Talks (aka the CURE list)
- Biology
- Cambridge Neuroscience Seminars
- Cambridge talks
- Centre for Health Leadership and Enterprise
- Chris Davis' list
- dh539
- Featured lists
- Guy Emerson's list
- Hanchen DaDaDash
- Inference Group
- Inference Group Summary
- Interested Talks
- Joint Machine Learning Seminars
- Life Science
- Life Sciences
- Machine Learning Summary
- ME Seminar
- ML
- Neurons, Fake News, DNA and your iPhone: The Mathematics of Information
- Neuroscience
- Neuroscience Seminars
- Required lists for MLG
- rp587
- Ryle Seminar Room, Cavendish Laboratory
- School of Physical Sciences
- Stem Cells & Regenerative Medicine
- Thin Film Magnetic Talks
- yk373's list
Note: Ex-directory lists are not shown.