Title :
Hybrid BFS Approach Using Semi-external Memory
Author :
Iwabuchi, Keita ; Sato, Hikaru ; Mizote, Ryo ; Yasui, Yuichiro ; Fujisawa, Katsuki ; Matsuoka, Shingo
Author_Institution :
Tokyo Inst. of Technol., Tokyo, Japan
Abstract :
NVM devices will greatly expand the possibility of processing extremely large-scale graphs that exceed the DRAM capacity of the nodes, however, efficient implementation based on detailed performance analysis of access patterns of unstructured graph kernel on systems that utilize a mixture of DRAM and NVM devices has not been well investigated. We introduce a graph data offloading technique using NVMs that augment the hybrid BFS (Breadth-first search) algorithm widely used in the Graph500 benchmark, and conduct performance analysis to demonstrate the utility of NVMs for unstructured data. Experimental results of a Scale27 problem of a Kronecker graph compliant to the Graph500 benchmark show that our approach maximally sustains 4.22 Giga TEPS (Traversed Edges Per Second), reducing DRAM size by half with only 19.18% performance degradation on a 4-way AMD Opteron 6172 machine heavily equipped with NVM devices. Although direct comparison is difficult, this is significantly greater than the result of 0.05 GTEPS for a SCALE 36 problem by using 1TB of DRAM and 12 TB of NVM as reported by Pearce et al. Although our approach uses higher DRAM to NVM ratio, we show that a good compromise is achievable between performance vs. capacity ratio for processing large-scale graphs. This result as well as detailed performance analysis of the proposed technique suggests that we can process extremely large-scale graphs per node with minimum performance degradation by carefully considering the data structures of a given graph and the access patterns to both DRAM and NVM devices. As a result, our implementation has achieved 4.35 MTEPS/W(Mega TEPS per Watt) and ranked 4th on November 2013 edition of the Green Graph500 list in the Big Data category by using only a single fat server heavily equipped with NVMs.
Keywords :
Big Data; DRAM chips; benchmark testing; data structures; graph theory; memory architecture; performance evaluation; tree searching; 4-way AMD Opteron 6172 machine; DRAM capacity; DRAM devices; Graph500 benchmark; Green Graph500 list; Kronecker graph; NVM devices; Scale27 problem; TEPS; big data category; data structures; graph data offloading technique; hybrid BFS approach; hybrid breadth-first search algorithm; large-scale graph processing; large-scale graphs; performance analysis; performance degradation; semiexternal memory; traversed edges per second; unstructured graph kernel; Arrays; Benchmark testing; Indexes; Nonvolatile memory; Performance analysis; Random access memory; Breadth-first search; Extreme Big Data; Graph algorithms; Memory architecture; NVM;
Conference_Titel :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
DOI :
10.1109/IPDPSW.2014.189