GraphChi and GraphChi-DB: large-scale graph computation on just a PC
- 👤 Speaker: Aapo Kyrölä (Facebook)
- 📅 Date & Time: Thursday 05 February 2015, 15:00 - 16:00
- 📍 Venue: FW26, Computer Laboratory, William Gates Builiding
Abstract
Current systems for graph computation require a distributed computing cluster to handle very large real-world problems, such as analysis on social networks or the web graph. While distributed computational resources have become more accessible, developing distributed graph algorithms still remains challenging, especially to non-experts.
We present GraphChi (OSDI¹12), a disk-based system for computing efficiently on graphs with billions of edges. By using a well-known method to break large graphs into small parts, and a novel Parallel Sliding Windows algorithm, GraphChi is able to execute several advanced data mining, graph mining, and machine learning algorithms on very large graphs, using just a single consumer-level computer. We show, through experiments and theoretical analysis, that GraphChi performs well on both SSDs and rotational hard drives.
We build on the basis of Parallel Sliding Windows to propose a new data structure, the Partitioned Adjacency Lists, which we use to design an online graph database, GraphChi-DB. We demonstrate that, on a single PC, GraphChi-DB can process over one hundred thousand graph updates per second, while simultaneously performing computation. GraphChi-DB compares favorable to existing graph databases, particularly on data that is much larger than the available memory.
We will then look at our research in hindsight and discuss what is GraphChi good for, and what it is not optimal for, and also discuss competing systems that have been proposed since. I will conclude with some remarks on open research problems.
Bio: Aapo Kyrola (Ph.D. Carnegie Mellon University, 2014) is a computer scientist, software engineer and startup entrepreneur. The main contribution of his Ph.D. Thesis is GraphChi (OSDI ¹12), a disk-based graph computation system that can handle as large graphs as many distributed systems, but on just a single PC or laptop. Mr. Kyrola also worked in the GraphLab project (UAI 2010, VLDB 2012 ) and proposed a parallel algorithm for L1-regularized minimization problems (ICML 2011). In CMU , he was advised by professors Carlos Guestrin and Guy Blelloch.
Prior to his Ph.D. Studies, in 2000 Mr. Kyrola co-founded Habbo Hotel (habbo.com), a pioneering web-based virtual world that at its peak attracted over 15 million monthly users and 50 million euros of revenue. During his Ph.D. studies, he co-founded another startup that developed a novel automatic activity tracker app Moves for iOS and Android. Moves was chosen as one of the best apps by both Apple and Google in 2013. Facebook acquired Moves in 2014, and now Mr. Kyrola works as a software engineer in Facebook’s London office.
Series This talk is part of the Computer Laboratory Systems Research Group Seminar series.
Included in Lists
- All Talks (aka the CURE list)
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge talks
- Chris Davis' list
- CL's SRG seminar
- Computer Laboratory Systems Research Group Seminar
- Department of Computer Science and Technology talks and seminars
- FW26, Computer Laboratory, William Gates Builiding
- Interested Talks
- ndk22's list
- ob366-ai4er
- rp587
- School of Technology
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.
![[Talks.cam]](/static/images/talkslogosmall.gif)

Aapo Kyrölä (Facebook)
Thursday 05 February 2015, 15:00-16:00