DocumentCode :
186371
Title :
Graph processing on GPUs: Where are the bottlenecks?
Author :
Qiumin Xu ; Hyeran Jeon ; Murali Annavaram
Author_Institution :
Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
fYear :
2014
fDate :
26-28 Oct. 2014
Firstpage :
140
Lastpage :
149
Abstract :
Large graph processing is now a critical component of many data analytics applications. Its uses range from social networking Web sites that provide context-aware services based on user connectivity data to medical informatics systems that diagnose a disease from a given set of symptoms. Graph processing has several inherently parallel computation steps interspersed with synchronization needs. Graphics processing units (GPUs) are being proposed as a power-efficient choice for exploiting this inherent parallelism. There have been several efforts to efficiently map graph applications to GPUs. However, few characterization studies provide an in-depth understanding of the interaction between GPGPU hardware components and the graph applications mapped to execute on GPUs. In this study, we compiled 12 graph applications and collected performance and utilization statistics for the core components of the GPU while running the applications on both a cycle-accurate simulator and a real GPU card. We present detailed application execution characteristics on GPUs. We then discuss and suggest several approaches to optimizing GPU hardware to enhance graph application performance.
Keywords :
data analysis; graph theory; graphics processing units; mathematics computing; parallel processing; synchronisation; GPGPU hardware components; GPU core components; GPU hardware optimization; application execution characteristics; cycle accurate simulator; data analytics; graph application mapping; graph application performance enhancement; graphics processing units; large-graph processing; parallel computation; parallelism; performance statistics; power-efficiency; real GPU card; synchronization; utilization statistics; Computational modeling; Graphics processing units; Hardware; Instruction sets; Kernel; Pipelines; Synchronization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2014 IEEE International Symposium on Workload Characterization (IISWC)
Conference_Location :
Raleigh, NC
Print_ISBN :
978-1-4799-6452-9
Type :
conf
DOI :
10.1109/IISWC.2014.6983053
Filename :
6983053