DocumentCode :
2440403
Title :
Performance evaluation of concurrent collections on high-performance multicore computing systems
Author :
Chandramowlishwaran, Aparna ; Knobe, Kathleen ; Vuduc, Richard
Author_Institution :
Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA, USA
fYear :
2010
fDate :
19-23 April 2010
Firstpage :
1
Lastpage :
12
Abstract :
This paper is the first extensive performance study of a recently proposed parallel programming model, called Concurrent Collections (CnC). In CnC, the programmer expresses her computation in terms of application-specific operations, partially-ordered by semantic scheduling constraints. The CnC model is well-suited to expressing asynchronous-parallel algorithms, so we evaluate CnC using two dense linear algebra algorithms in this style for execution on state-of-the-art multicore systems: (i) a recently proposed asynchronous-parallel Cholesky factorization algorithm, (ii) a novel and non-trivial "higher-level" partly-asynchronous generalized eigensolver for dense symmetric matrices. Given a well-tuned sequential BLAS, our implementations match or exceed competing multithreaded vendor-tuned codes by up to 2.6×. Our evaluation compares with alternative models, including ScaLAPACK with a shared memory MPI, OpenMP, Cilk++, and PLASMA 2.0, on Intel Harpertown, Nehalem, and AMD Barcelona systems. Looking forward, we identify new opportunities to improve the CnC language and runtime scheduling and execution.
Keywords :
eigenvalues and eigenfunctions; matrix decomposition; multi-threading; multiprocessing systems; parallel algorithms; parallel programming; software performance evaluation; AMD Barcelona system; Cilk++; Intel Harpertown; Nehalem; OpenMP; PLASMA 2.0; ScaLAPACK; asynchronous parallel Cholesky factorization algorithms; asynchronous parallel algorithms; concurrent collection performance evaluation; dense symmetric matrices; high performance multicore computing systems; linear algebra algorithms; multithreaded vendor tuned codes; parallel programming; partly asynchronous generalized eigensolver; semantic scheduling constraints; sequential BLAS; shared memory MPI; Concurrent computing; High performance computing; Linear algebra; Multicore processing; Parallel programming; Plasma density; Processor scheduling; Programming profession; Runtime; Symmetric matrices;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on
Conference_Location :
Atlanta, GA
ISSN :
1530-2075
Print_ISBN :
978-1-4244-6442-5
Type :
conf
DOI :
10.1109/IPDPS.2010.5470404
Filename :
5470404