DocumentCode :
3145065
Title :
Exploiting Hierarchical Parallelism Using UPC
Author :
Wang, Lingyuan ; Merchant, Saumil ; El-Ghazawi, Tarek
Author_Institution :
Dept. of Electr. & Comput. Eng., George Washington Univ., Washington, DC, USA
fYear :
2011
fDate :
16-20 May 2011
Firstpage :
1216
Lastpage :
1224
Abstract :
High-Performance Computing (HPC) systems are increasingly moving towards deeply hierarchical architectures. However, the single-level execution model embodied in legacy parallel programming models falls short of exploiting the multi-level parallelism available in both hardware architectures and applications. Richer execution models are therefore needed to fully exploit hierarchical parallelism. Partitioned Global Address Space (PGAS) languages such as Unified Parallel C (UPC) are growing in popularity because they provide a globally shared address space with locality awareness. While UPC is a welcome improvement over message passing libraries, users still program with a single level of parallelism in the context of SPMD. In this paper, we explore two explicit hierarchical programming approaches based on UPC to improve programmability and performance on hierarchical architectures. The first approach orchestrates computations on multiple sets of thread groups; the second extends UPC with nested, shared memory multi-threading. This paper presents a detailed description of the proposed approaches and demonstrates their effectiveness in the context of the NAS Parallel Benchmarks and the Unbalanced Tree Search (UTS). Experimental results indicate that the hierarchical model not only provides greater expressive power but also enhances performance: all three benchmarks exceed the performance of the standard UPC implementations after being incrementally enhanced with hierarchical parallelism.
Keywords :
C language; multi-threading; parallel programming; shared memory systems; software maintenance; tree searching; HPC system; NAS parallel benchmark; PGAS language; Unified Parallel C; hardware architecture; hierarchical architecture; hierarchical parallelism; hierarchical programming; high-performance computing; legacy parallel programming model; multilevel parallelism; partitioned global address space; programmability; single-level parallelism; unbalanced tree search; Electronics packaging; Instruction sets; Parallel processing; Programming; Runtime; Three dimensional displays;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), 2011 IEEE International Symposium on
Conference_Location :
Shanghai
ISSN :
1530-2075
Print_ISBN :
978-1-61284-425-1
Electronic_ISBN :
1530-2075
Type :
conf
DOI :
10.1109/IPDPS.2011.273
Filename :
6008972