BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Integrating Scale Out and Fault Tolerance in Stream Processing usi
 ng Operator State Management - Eva Kalyvianaki (City University London)
DTSTART:20131023T120000Z
DTEND:20131023T130000Z
UID:TALK43914@talks.cam.ac.uk
CONTACT:Eiko Yoneki
DESCRIPTION:As users of “big data” applications expect fresh results\,
  we witness a new breed of stream processing systems (SPS) that are design
 ed to scale to large numbers of cloud-hosted machines. Such systems face n
 ew challenges: (i) to benefit from the “pay-as-you-go” model of cloud 
 computing\, they must scale out on demand\, acquiring additional virtual m
 achines (VMs) and parallelising operators when the workload increases\; (i
 i) failures are common with deployments on hundreds of VMs—systems must 
 be fault-tolerant with fast recovery times\, yet low per-machine overheads
 . An open question is how to achieve these two goals when stream queries i
 nclude stateful operators\, which must be scaled out and recovered without
  affecting query results.\n\nOur key idea is to expose internal operator s
 tate explicitly to the SPS through a set of state management primitives.
  Based on them\, we describe an integrated approach for dynamic scale out 
 and recovery of stateful operators. Externalised operator state is checkpo
 inted periodically by the SPS and backed up to upstream VMs. The SPS ident
 ifies individual operator bottlenecks and automatically scales them out by
  allocating new VMs and partitioning the checkpointed state. At any point\
 , failed operators are recovered by restoring checkpointed state on a new 
 VM and replaying unprocessed tuples. We evaluate this approach with the Li
 near Road Benchmark on the Amazon EC2 cloud platform and show that it can 
 scale automatically to a load factor of L=350 with 50 VMs\, while recoveri
 ng quickly from failures.\n\nJoint work with Raul Castro Fernandez\, Matte
 o Migliavacca and Peter Pietzuch.\n\nBio: Eva Kalyvianaki is a lecturer
  at City University London in the Department of Computer Science. She
  holds a PhD from Cambridge University and MSc and BSc degrees from the
  University of Crete\, Greece. Before joining City University she was a
  post-doctoral researcher at Imperial College London. Her research
  interests span the areas of cloud computing\, real-time query
  processing\, autonomic computing and systems and performance in general.
LOCATION:LT2\, Computer Laboratory\, William Gates Building
END:VEVENT
END:VCALENDAR
