BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Big data integration: challenges and new approaches - Erhard Rahm 
 (Universität Leipzig)
DTSTART:20160914T080000Z
DTEND:20160914T090000Z
UID:TALK67350@talks.cam.ac.uk
CONTACT:INI IT
DESCRIPTION:Data integration is a key challenge for Big Data applications
  to semantically enrich and combine large sets of heterogeneous data for
  enhanced data analysis. In many cases\, there is also a need to deal
  with a very high number of data sources\, e.g.\, product offers from
  many e-commerce websites. We will discuss approaches to deal with the
  key data integration tasks of (large-scale) entity resolution and
  schema matching. In particular\, we discuss parallel blocking and
  entity resolution on Hadoop platforms together with load balancing
  techniques to deal with data skew. We also discuss challenges and
  recent approaches for holistic data integration of many data
  sources\, e.g.\, to create knowledge graphs or to make use of huge
  collections of web tables.
LOCATION:Seminar Room 1\, Newton Institute
END:VEVENT
END:VCALENDAR
