DocumentCode :
1783372
Title :
Scaling Irregular Applications through Data Aggregation and Software Multithreading
Author :
Morari, Alessandro ; Tumeo, Antonino ; Chavarria-Miranda, D. ; Villa, Oreste ; Valero, M.R.
Author_Institution :
Pacific Northwest National Laboratory, Richland, WA, USA
fYear :
2014
fDate :
19-23 May 2014
Firstpage :
1126
Lastpage :
1135
Abstract :
Emerging applications in areas such as bioinformatics, data analytics, semantic databases and knowledge discovery employ datasets ranging from tens to hundreds of terabytes. Currently, only distributed memory clusters have enough aggregate space to enable in-memory processing of datasets of this size. However, in addition to their large size, the data structures used by these new application classes are usually characterized by unpredictable and fine-grained accesses; that is, they exhibit irregular behavior. Traditional commodity clusters, instead, rely on cache-based processors and high-bandwidth networks optimized for locality, regular computation and bulk communication. For these reasons, irregular applications are inefficient on these systems and require custom, hand-coded optimizations to scale in both performance and size. Lightweight software multithreading, which tolerates data access latencies by overlapping network communication with computation, and aggregation, which reduces overheads and increases bandwidth utilization by coalescing fine-grained network messages, are key techniques that can improve the performance of large-scale irregular applications on commodity clusters. In this paper we describe GMT (Global Memory and Threading), a runtime system library that couples software multithreading and message aggregation with a Partitioned Global Address Space (PGAS) data model to enable higher performance and scaling of irregular applications on multi-node systems. We present the architecture of the runtime, explaining how it is designed around these two critical techniques. We show that irregular applications written using our runtime can outperform, even by orders of magnitude, the corresponding applications written using other programming models that do not exploit these techniques.
Keywords :
data models; data structures; multi-threading; software architecture; software libraries; software performance evaluation; GMT; PGAS data model; bandwidth utilization; commodity clusters; custom hand-coded optimization; data access latency; data aggregation; data structures; distributed memory clusters; fine-grained access; fine-grained network messages; global memory and threading; in-memory dataset processing; large scale irregular application performance; lightweight software multithreading; message aggregation; multinode systems; overhead reduction; partitioned global address space data model; performance scaling; runtime architecture; runtime system library; Arrays; Bandwidth; Electronics packaging; Message systems; Multithreading; Runtime; Software; Multithreading; PGAS; aggregation; semantic graph databases;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
Conference_Location :
Phoenix, AZ
ISSN :
1530-2075
Print_ISBN :
978-1-4799-3799-8
Type :
conf
DOI :
10.1109/IPDPS.2014.117
Filename :
6877341