BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Cobra - Building an Internet-Scale Publish/Subscribe System - Pete
 r Pietzuch\, Imperial College London
DTSTART:20080129T143000Z
DTEND:20080129T153000Z
UID:TALK9598@talks.cam.ac.uk
CONTACT:Minor Gordon
DESCRIPTION:Blogs and RSS feeds are becoming increasingly popular. The blo
 gging site LiveJournal has over 11 million user accounts\, and according t
 o one report\, over 1.6 million postings are made to blogs every day. The 
 "Blogosphere" is a new hotbed of Internet-based media that represents a sh
 ift from mostly static content to dynamic\, continuously-updated discussio
 ns. The problem is that finding and tracking blogs with interesting conten
 t is an extremely cumbersome process. In this talk\, I present our work on
  Cobra (Content-Based RSS Aggregator)\, a publish/subscribe system that cr
 awls and filters vast numbers of RSS feeds\, delivering to each user a per
 sonalised feed based on their interests. Cobra consists of a three-tiered 
 network of crawlers that scan web feeds\, filters that match crawled artic
 les to user subscriptions\, and reflectors that provide recently-matching 
 articles on each subscription as an RSS feed\, which can be browsed using 
 a standard RSS reader. I will talk about the design\, implementation and e
 valuation of Cobra in three settings: a dedicated cluster\, the Emulab tes
 tbed and PlanetLab. I also present our performance study of the Cobra s
 ystem\, demonstrating that the system is able to scale well to support a l
 arge number of source feeds and users\; that the mean update detection lat
 ency is low (bounded by the crawler rate)\; and that an offline service pr
 ovisioning step combined with several performance optimisations is effecti
 ve at reducing memory usage and network load.\n
LOCATION:Room FW11\, Computer Laboratory\, William Gates Building
END:VEVENT
END:VCALENDAR
