Title :
Leveraging memory mapping for fast and scalable graph computation on a PC
Author :
Zhiyuan Lin ; Chau, Duen Horng Polo ; Kang, U.
Author_Institution :
Coll. of Comput., Georgia Tech, Atlanta, GA, USA
Abstract :
Large graphs with billions of nodes and edges are increasingly common, calling for new kinds of scalable computation frameworks. Although popular, distributed approaches can be expensive to build, or require many resources to manage or tune. State-of-the-art approaches such as GraphChi and TurboGraph recently have demonstrated that a single machine can efficiently perform advanced computation on billion-node graphs. Although fast, they both use sophisticated data structures, memory management, and optimization techniques. We propose a minimalist approach that forgoes such complexities, by leveraging the memory mapping capability found on operating systems. Our experiments on large datasets, such as a 1.5 billion edge Twitter graph, show that our streamlined approach achieves up to 26 times faster than GraphChi, and comparable to TurboGraph. We contribute our crucial insight that by leveraging memory mapping, a fundamental operating system capability, we can outperform the latest graph computation techniques.
Keywords :
data mining; data structures; graph theory; operating systems (computers); optimisation; storage management; GraphChi; PC; TurboGraph; Twitter graph; billion-node graphs; data structures; graph computation techniques; large datasets; memory management; memory mapping capability; minimalist approach; operating system capability; operating systems; optimization techniques; scalable computation frameworks; Data mining; Libraries; Memory management; Operating systems; Optimization; Random access memory; Twitter; graph mining; memory mapping; scalable algorithms; single machine;
Conference_Titel :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
DOI :
10.1109/BigData.2013.6691739