BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Architectures for large-scale continuous data management - Dionysi
 os Logothetis (Telefonica)
DTSTART:20120706T100000Z
DTEND:20120706T110000Z
UID:TALK38309@talks.cam.ac.uk
CONTACT:Eiko Yoneki
DESCRIPTION:The ability to do rich analytics on massive sets of unstructur
 ed data drives the operation of many organizations today. These “big dat
 a” analytics have given rise to a new class of data-intensive computing
  systems\, like MapReduce\, that can scale to very large data simply by em
 ploying more compute power. While these systems have been very successful\
 , it is becoming apparent that scalability alone is not enough.\nMany anal
 ytics today are update-driven\, and this brute-force approach is inefficie
 nt when trying to keep analytics up-to-date as data change continuously.\n
 \nIn the first part of the talk\, I will present a new approach for progra
 mming analytics that takes the continuous nature of data into consideratio
 n. A fundamental requirement for efficient processing of continuous data i
 s the ability to incrementally update the analytics by maintaining computa
 tion state. I will argue that state should be a first-class abstraction an
 d present Continuous Bulk Processing (CBP)\, a model and architecture that
  integrates data-parallelism for scalability with state for efficient upda
 te-driven analytics. The model lends itself to several analytics\, like in
 cremental algorithms and iterative analysis.\nThrough real-world applicati
 ons\, I will show how the integration of state in the programming model af
 fords several optimizations in the underlying system\, reducing processing
  time and resource usage relative to current practice.\n\nWhile integratin
 g state in the programming model allows efficient incremental programs\, i
 t may be challenging to design incremental algorithms for complex analytic
 s\, like iterative graph mining and machine learning. In the second part\,
  I will talk about ongoing work on a system that can incrementally compute
  this class of analytics in a manner that is transparent to the user.\n\n\
 nBio:\nI am an Associate Researcher with the Telefonica Research lab in Ba
 rcelona\, Spain. I am primarily interested in building systems for large-s
 cale data mining. My broader research interests lie in the areas of data m
 anagement\, cloud computing and distributed systems. I received my PhD in 
 Computer Science from the University of California\, San Diego and Diploma
  in Computer Science & Engineering from the National Technical University 
 of Athens\, Greece.\n
LOCATION:FW26\, Computer Laboratory\, William Gates Building
END:VEVENT
END:VCALENDAR
