BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Frequency and cardinality recovery from sketched data: a novel app
 roach bridging Bayesian and frequentist views - Stefano Favaro (University
  of Turin)
DTSTART:20231103T140000Z
DTEND:20231103T150000Z
UID:TALK206014@talks.cam.ac.uk
CONTACT:Qingyuan Zhao
DESCRIPTION:We study how to recover the frequency of a symbol in a large d
 iscrete data set\, using only a (lossy) compressed representation\, or ske
 tch\, of those data obtained via random hashing.\nThis is a classical prob
 lem at the crossroads of computer science and information theory\, with va
 rious algorithms available\, such as the count-min sketch. However\, these
  algorithms often assume that the data are fixed\, leading to overly conse
 rvative and potentially inaccurate estimates when dealing with randomly sa
 mpled data. In this talk\, we consider the sketched data as a random sampl
 e from an unknown distribution\, and then we introduce novel estimators th
 at improve upon existing approaches. Our method combines Bayesian nonparam
 etric and classical (frequentist) perspectives\, addressing their unique l
 imitations to provide a principled and practical solution. Additionally\, 
 we extend our method to address the related but distinct problem of cardin
 ality recovery\, which consists of estimating the total number of distinct
  objects in the data set. We validate our method on synthetic and real dat
 a\, comparing its performance to state-of-the-art alternatives.
LOCATION:MR12\, Centre for Mathematical Sciences
END:VEVENT
END:VCALENDAR
