Applied Probabilistic Algorithms for Big Data Analysis
- đ¤ Speaker: Advait Sarkar (University of Cambridge)
- đ Date & Time: Tuesday 03 November 2015, 13:00 - 14:00
- đ Venue: Computer Laboratory, William Gates Building, Room SW01
Abstract
Introductory algorithms courses encourage us to think of computers as perfect machines that calculate exact answers. We typically design programs to provide exactly this type of perfection. However, it is possible to construct efficient algorithms by relaxing the zero error constraint. The demand for space and time resources can be drastically reduced in exchange of a small, quantifiable probability of error.
In this lecture, we will follow the journey of MildlyInappropriateCatAppreciationSociety.com and its competitors as they try to tackle some of the problems of managing large amounts of cat-related data. Motivated by examples and terrible cat puns, you will learn 5 probabilistic techniques that allow you do things such as:- efficiently test whether an item is already present in a gigantic distributed database
- efficiently count the number of distinct items in said big database
- efficiently tabulate the frequencies of different items in said big database
You will learn these techniques and their error bounds in sufficient detail that you will be able to implement them once the lecture is finished. They can all be implemented in a few dozen lines of code!
Series This talk is part of the Research Students Lecture Series series.
Included in Lists
- All Talks (aka the CURE list)
- bld31
- Cambridge talks
- Computer Education Research
- Computer Laboratory, William Gates Building, Room SW01
- Computing Education Research
- Department of Computer Science and Technology talks and seminars
- Guy Emerson's list
- Interested Talks
- School of Technology
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.
![[Talks.cam]](/static/images/talkslogosmall.gif)


Tuesday 03 November 2015, 13:00-14:00