DocumentCode :
3052601
Title :
A framework for FPGA acceleration of large graph problems: Graphlet counting case study
Author :
Betkaoui, Brahim ; Thomas, David B. ; Luk, Wayne ; Przulj, Natasa
Author_Institution :
Dept. of Comput., Imperial Coll. London, London, UK
fYear :
2011
fDate :
12-14 Dec. 2011
Firstpage :
1
Lastpage :
8
Abstract :
In many application domains, data are represented using large graphs involving millions of vertices and edges. Graph analysis algorithms, such as finding short paths and isomorphic subgraphs, are largely dominated by memory latency. Large cluster-based computing platforms can process graphs efficiently if the graph data can be partitioned, and on a smaller scale partitioning can be used to allocate graphs to low-latency on-chip RAMs in reconfigurable devices. However, there are many graph classes, such as scale-free social networks, which lack the locality to make partitioning graph data an efficient solution to the latency problem and are far too large to fit in on-chip RAMs and caches. In this paper, we present a framework for reconfigurable hardware acceleration of these large-scale graph problems that are difficult to partition and require high-latency off-chip memory storage. Our reconfigurable architecture tolerates off-chip memory latency by using a memory crossbar that connects many parallel identical processing elements to shared off-chip memory, without a traditional cached memory hierarchy. Quantitative comparison between the software and hardware performance of a graphlet counting case-study shows that our hardware implementation outperforms a quad-core software implementation by 10 times for large graphs. This speedup includes all software and IO overhead required, and reduces execution time for this common bioinformatics algorithm from about 2 hours to just 12 minutes. These results demonstrate that our methodology for accelerating graph algorithms is a promising approach for efficient parallel graph processing.
Keywords :
bioinformatics; field programmable gate arrays; graph theory; microprocessor chips; parallel processing; random-access storage; reconfigurable architectures; FPGA acceleration; bioinformatics algorithm; cluster-based computing; graph allocation; graph analysis algorithms; graph data partitioning; graphlet counting case study; high-latency off-chip memory storage; isomorphic subgraphs; large-scale graph problems; memory crossbar; memory latency; off-chip memory latency; on-chip RAM; parallel graph processing; parallel identical processing elements; reconfigurable architecture; reconfigurable devices; short paths; social networks; Algorithm design and analysis; Field programmable gate arrays; Hardware; Memory management; Radiation detectors; Random access memory; Software;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Field-Programmable Technology (FPT), 2011 International Conference on
Conference_Location :
New Delhi
Print_ISBN :
978-1-4577-1741-3
Type :
conf
DOI :
10.1109/FPT.2011.6132667
Filename :
6132667
Link To Document :
بازگشت