A Look at Partial Projections for Regression onto Text
- Speaker: Matt Taddy (University of Chicago)
- Date & Time: Friday 28 May 2010, 16:00 - 17:00
- Venue: MR12, CMS, Wilberforce Road, Cambridge, CB3 0WB
Abstract
An increasingly common problem in data analysis is to infer the relationship between text and characteristics of the speaker or document source. Various modifications of the multinomial bag-of-words model are most prominent among approaches designed specifically for text regression, although many generic high-dimensional pattern recognition techniques are also applicable. We investigate one such generic technique, partial least-squares (PLS), which is commonly used in engineering and physical sciences. This inquiry is motivated by the discovery that "slant-measure", a heuristic from political economics for regressing ideology onto text, is just the first PLS direction. Our goal is to provide a Bayesian analysis scheme for text regression which takes advantage of the mechanics (and initial economic motivation) of PLS, and to this end we devise model-based interpretations of the algorithm and adapt these to account for the specifics of text-count covariate matrices. Results are provided in the motivating application of ideology analysis for the 109th US Congress.
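To make "the first PLS direction" concrete, here is a minimal sketch of how that direction is usually computed for a single response: the weight on each term is proportional to the covariance between that term's column and the response, and documents are then scored by projecting onto those weights. This is not the speaker's implementation. The toy data, the row normalization of counts to frequencies, and the names `first_pls_direction`, `freqs`, and `y` are all assumptions made for illustration.

```python
import numpy as np


def first_pls_direction(X, y):
    """Return the unit-norm first PLS weight vector and the document scores.

    X : (n_docs, n_terms) covariate matrix (here, term frequencies)
    y : (n_docs,) response (here, a stand-in for an ideology score)
    """
    Xc = X - X.mean(axis=0)      # center each term column
    yc = y - y.mean()            # center the response
    w = Xc.T @ yc                # proportional to per-term covariance with y
    norm = np.linalg.norm(w)
    if norm == 0:
        raise ValueError("y has zero covariance with every column of X")
    w = w / norm
    z = Xc @ w                   # one-dimensional projection of each document
    return w, z


# Hypothetical toy data: 50 documents, 8 terms, counts that depend on y.
rng = np.random.default_rng(0)
n_docs, n_terms = 50, 8
y = rng.normal(size=n_docs)
rates = np.exp(0.5 * np.outer(y, rng.normal(size=n_terms)))
counts = rng.poisson(5.0 * rates)

# Assumption: work with row-normalized frequencies rather than raw counts.
freqs = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)

w, z = first_pls_direction(freqs, y)

# Forward step: ordinary least squares of y on the single score z.
slope = (z @ (y - y.mean())) / (z @ z)
y_hat = y.mean() + slope * z
```

The slope is positive by construction, because the score `z` is built from the covariances with `y`. Later PLS directions would repeat the same step on residuals, which this sketch omits.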
Series
This talk is part of the Statistics series.