Title : 
Fast and scalable NUMA-based thread parallel breadth-first search
         
        
            Author : 
Yasui, Yuichiro ; Fujisawa, Katsuki
         
        
            Author_Institution : 
Kyushu Univ., Fukuoka, Japan
         
        
        
        
        
        
            Abstract : 
The breadth-first search (BFS) is one of the most centric kernels in graph processing. Beamer´s direction-optimizing BFS algorithm, which selects one of two traversal directions at each level, can reduce unnecessary edge traversals. In a previous paper, we presented an efficient BFS for a non-uniform memory access (NUMA)-based system, in which the NUMA architecture was carefully considered. In this paper, we investigate the locality of memory accesses in terms of the communication with remote memories in a BFS for a NUMA system, and describe a fast and highly scalable implementation. Our new implementation achieves performance rates of 174.704 billion edges per second for a Kronecker graph with 233 vertices and 237 edges on two racks of a SGI UV 2000 system with 1,280 threads. The implementations described in this paper achieved the fastest entries for a shared-memory system in the June 2014 and November 2014 Graph500 lists, and produced the most energy-efficient entries in the second, third, and fourth Green Graph500 lists (big data category).
         
        
            Keywords : 
Big Data; graph theory; parallel processing; performance evaluation; power aware computing; search problems; shared memory systems; Beamer direction-optimizing BFS algorithm; Kronecker graph; NUMA-based thread parallel breadth-first search; SGI UV 2000 system; big data category; energy-efficient entries; graph processing; green graph500 lists; nonuniform memory access-based system; performance rates; remote memories; shared-memory system; traversal directions; Approximation algorithms; Benchmark testing; Indexes; Instruction sets; Libraries; Memory management; Sockets; Breadth-first search; Graph algorithm; Graph500 benchmark; NUMA-aware; parallel algorithm;
         
        
        
        
            Conference_Titel : 
High Performance Computing & Simulation (HPCS), 2015 International Conference on
         
        
            Conference_Location : 
Amsterdam
         
        
            Print_ISBN : 
978-1-4673-7812-3
         
        
        
            DOI : 
10.1109/HPCSim.2015.7237065