DocumentCode :
167307
Title :
ABC2: Adaptively Balancing Computation and Communication in a DSM Cluster of Multicores for Irregular Applications
Author :
Charan Koduru, Sai ; Vora, K. ; Gupta, Rajesh
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of California, Riverside, Riverside, CA, USA
fYear :
2014
fDate :
19-23 May 2014
Firstpage :
391
Lastpage :
400
Abstract :
Graph-based applications have become increasingly important in many domains. Their large graph sizes offer data-level parallelism at a scale that makes it attractive to run such applications on modern clusters of multicore machines based on distributed shared memory (DSM). Our analysis of several graph applications that rely on speculative or asynchronous parallelism shows that the balance between computation and communication differs between applications. In this paper, we study this balance in the context of DSMs and exploit the multiple cores of modern multicore machines by creating three kinds of threads, which allow us to dynamically balance computation and communication: compute threads, which exploit data-level parallelism in the computation; fetch threads, which replicate data into object-stores before it is accessed by compute threads; and update threads, which make results computed by compute threads visible to all compute threads by writing them to the DSM. We observe that the best configuration of these mechanisms varies across inputs as well as across applications. To this end, we design ABC2: a runtime algorithm that automatically configures the DSM using simple runtime information such as the observed object prefetch and update queue lengths. This runtime algorithm achieves speedups close to those of the best hand-optimized configurations.
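The abstract says only that the runtime configures the DSM from observed prefetch and update queue lengths; it does not give the algorithm. The following is therefore a minimal illustrative sketch of one way a queue-length-driven rebalancing step could look, not the ABC2 algorithm itself. The names `ThreadConfig` and `rebalance`, the thresholds `high` and `low`, and the one-thread-at-a-time policy are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ThreadConfig:
    """Number of threads assigned to each role (hypothetical structure)."""
    compute: int
    fetch: int
    update: int


def rebalance(cfg, prefetch_queue_len, update_queue_len, high=64, low=8):
    """Shift at most one thread per queue between compute and a helper role.

    Assumed policy (not taken from the paper): a long queue means the helper
    threads are falling behind, so a compute thread is reassigned to help;
    a short queue means a helper thread can return to computing.
    The total thread count never changes.
    """
    compute, fetch, update = cfg.compute, cfg.fetch, cfg.update

    # Prefetch queue: pending object fetches requested ahead of compute.
    if prefetch_queue_len > high and compute > 1:
        compute, fetch = compute - 1, fetch + 1
    elif prefetch_queue_len < low and fetch > 1:
        compute, fetch = compute + 1, fetch - 1

    # Update queue: computed results waiting to be written to the DSM.
    if update_queue_len > high and compute > 1:
        compute, update = compute - 1, update + 1
    elif update_queue_len < low and update > 1:
        compute, update = compute + 1, update - 1

    return ThreadConfig(compute, fetch, update)
```

A monitoring loop could call `rebalance` periodically with the current queue lengths and resize its thread pools to match the returned configuration; keeping at least one thread in every role avoids starving any stage.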
Keywords :
distributed shared memory systems; multi-threading; ABC2; DSM-based clusters; adaptively balancing computation-and-communication; asynchronous parallelism; compute threads; data level parallelism; data replication; distributed shared memory; dynamic balancing; fetch threads; graph sizes; graph-based applications; irregular applications; multicore machines; object prefetch; object-stores; optimized configurations; queue length update; runtime algorithm; runtime information; speculative parallelism; update threads; Adaptation models; Computational modeling; Multicore processing; Parallel processing; Prefetching; Runtime; Asynchronous Parallelism; Clusters; Distributed Shared Memory; Dynamic Adaptive Model; Runtime Monitoring; Speculative Parallelism;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
Type :
conf
DOI :
10.1109/IPDPSW.2014.51
Filename :
6969414