BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Language and Demographics on Twitter: Inferring Latent User Attrib
 utes from Streaming Communications - Svitlana Volkova\, Johns Hopkins Univ
 ersity
DTSTART:20140912T110000Z
DTEND:20140912T120000Z
UID:TALK54267@talks.cam.ac.uk
CONTACT:Ekaterina Kochmar
DESCRIPTION:Content shared locally within a user’s social network can re
 veal latent attributes of a user. However\, not all attributes are pronoun
 ced equally given similar amounts of content (some attributes are harder t
 o predict). We explore various network structures on Twitter for the predi
 ction of attributes of varying levels of difficulty (gender\, age\, and po
 litical beliefs)\, examining the impact of graph type and amount of availa
 ble content. We show that even when limited or no self-authored data is av
 ailable\, language from neighbor communications provides sufficient evid
 ence for prediction. We find that a friend graph leads to the highest ac
 curacy for gender\, while a follower graph is preferred for age\, an
 d a retweet graph is best for political belief classification.\n\nHowev
 er\, the above mod
 els for social media personal analytics assume access to thousands of mess
 ages per user\, even though most users author content only sporadically ov
 er time. Given this sparsity\, we: (i) leverage content from the local nei
 ghborhood of a user and (ii) estimate the amount of time and tweets requir
 ed for a dynamic model to predict user preferences. When updating our dyna
 mic models over time\, we find that political beliefs can often be predi
 cted using roughly 100 tweets\, depending on the context of user select
 ion. Collecting that many tweets could take hours or weeks\, depending o
 n the author’s tweeting frequency.
LOCATION:FW26\, Computer Laboratory
END:VEVENT
END:VCALENDAR
