BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Computational Methods for Linking Sets of National Files - Bill Wi
 nkler (U.S. Census Bureau)
DTSTART:20160912T123000Z
DTEND:20160912T133000Z
UID:TALK67311@talks.cam.ac.uk
CONTACT:INI IT
DESCRIPTION:A combination of faster hardware and new computational algorit
 hms makes it possible to link two or more national files having suitable q
 uasi-identifying information such as name\, address\, date-of-birth and ot
 her non-uniquely identifying information far faster than methods of a deca
 de earlier. The methods (Winkler\, Yancey\, and Porter 2010) were used fo
 r matching 10^17 pairs (300 million x 300 million) using 40 CPUs of an SG
 I machine (with 2006 Itanium chips) in less than 30 hours during the 201
 0 U.S. Decennial Census. The methods are 50 times as fast as PSwoosh par
 allel software (Kawai et al. 2006) from Stanford University. The metho
 ds are ~10 times as fast as recent parallel software that applies new me
 thods of load balancing (Rahm and Kolb 2013\, Yan et al. 2013\, Karapipe
 ris and Verykios 2014). This talk will describe how this software bypass
 es the need for system sorts and provides highly optimized sea
 rch-retrieval-c
 omparison for a narrow range of situations needed for record linkage.
 \n\nRelated Links\n- https://fcsm.sites.usa.gov/files/2014/05/J1_Winkler_
 2013FCSM.pdf - describes methods for clean-up of sets of national files
LOCATION:Seminar Room 1\, Newton Institute
END:VEVENT
END:VCALENDAR
