BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Predicting Faults in Heterogeneous\, Federated Distributed Systems
  - Marco Canini (EPFL)
DTSTART:20100929T133000Z
DTEND:20100929T143000Z
UID:TALK26750@talks.cam.ac.uk
CONTACT:Eiko Yoneki
DESCRIPTION:It is notoriously difficult to make distributed systems reliab
 le. This becomes even harder in the case of the widely-deployed systems th
 at become heterogeneous and federated. The set of routers in charge of the
  inter-domain routing in the Internet is a prime example of such a system.
  The unanticipated interaction of nodes under seemingly valid configuratio
 n changes and local fault-handling can have a profound effect. For example
 \, the Internet has suffered from multiple IP prefix hijackings\, as well 
 as performance and reliability problems due to emergent behavior resulting
  from a local session reset.\n\nWe argue that the key step in making thes
 e systems reliable is to predict faults automatically. In this talk\, I
  will describe the design and implementation of DiCE\, a system that
  uses temporal and spatial awareness to predict faults in heterogeneous\, 
 federated systems. Our live evaluation in the testbed shows that DiCE quic
 kly and successfully predicts two important classes of faults\, operator m
 istakes and programming errors\, that have plagued BGP routing in the Inte
 rnet.\n\nJoint work with Vojin Jovanovic\, Gautam Kumar\, and Dejan Kostic
 \n\nMarco's home page: http://people.epfl.ch/marco.canini\n
LOCATION:FW26\, Computer Laboratory\, William Gates Building
END:VEVENT
END:VCALENDAR
