Regional Information Center for Science and Technology (RICeST) - Performance evaluation of concurrent collections on high-performance multicore computing systems

DocumentCode :

2440403

Title :

Performance evaluation of concurrent collections on high-performance multicore computing systems

Author :

Chandramowlishwaran, Aparna ; Knobe, Kathleen ; Vuduc, Richard

Author_Institution :

Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA, USA

fYear :

2010

fDate :

19-23 April 2010

Firstpage :

Lastpage :

Abstract :

This paper is the first extensive performance study of a recently proposed parallel programming model, called Concurrent Collections (CnC). In CnC, the programmer expresses her computation in terms of application-specific operations, partially-ordered by semantic scheduling constraints. The CnC model is well-suited to expressing asynchronous-parallel algorithms, so we evaluate CnC using two dense linear algebra algorithms in this style for execution on state-of-the-art multicore systems: (i) a recently proposed asynchronous-parallel Cholesky factorization algorithm, (ii) a novel and non-trivial "higher-level" partly-asynchronous generalized eigensolver for dense symmetric matrices. Given a well-tuned sequential BLAS, our implementations match or exceed competing multithreaded vendor-tuned codes by up to 2.6×. Our evaluation compares with alternative models, including ScaLAPACK with a shared memory MPI, OpenMP, Cilk++, and PLASMA 2.0, on Intel Harpertown, Nehalem, and AMD Barcelona systems. Looking forward, we identify new opportunities to improve the CnC language and runtime scheduling and execution.
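The idea of operations that are only partially ordered by their data dependencies can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the CnC API: the function name `cholesky_dataflow`, the `(i, j, k)` item keys, and the round-based scheduler are all assumptions made for illustration, and each "tile" is a single scalar rather than a BLAS-sized block. Every step declares the items it reads and the one item it writes, and a step runs as soon as its inputs exist, regardless of the order in which steps were listed.

```python
import math

def cholesky_dataflow(a):
    """Factor a symmetric positive-definite matrix `a` (list of lists) as L * L^T.

    Illustrative only: 1x1 "tiles" and a toy scheduler, not a CnC runtime.
    """
    n = len(a)
    # Single-assignment item store (hypothetical stand-in for an item collection):
    # key (i, j, k) holds entry (i, j) of the lower triangle after k updates.
    items = {(i, j, 0): a[i][j] for i in range(n) for j in range(i + 1)}

    # Each step is (name, input keys, output key, function of the inputs).
    steps = []
    for k in range(n):
        # Factor the diagonal entry of column k.
        steps.append(("factor", [(k, k, k)], (k, k, k + 1),
                      lambda d: math.sqrt(d)))
        for i in range(k + 1, n):
            # Solve for the entry below the diagonal in column k.
            steps.append(("solve", [(i, k, k), (k, k, k + 1)], (i, k, k + 1),
                          lambda x, d: x / d))
            for j in range(k + 1, i + 1):
                # Update the trailing entry (i, j) with column k's contribution.
                steps.append(("update",
                              [(i, j, k), (i, k, k + 1), (j, k, k + 1)],
                              (i, j, k + 1),
                              lambda x, p, q: x - p * q))

    # Present the steps in reversed order to show that only data
    # availability, not listing order, decides when a step runs.
    pending = list(reversed(steps))
    while pending:
        ready = [s for s in pending if all(key in items for key in s[1])]
        if not ready:
            raise RuntimeError("no runnable step: dependency cannot be met")
        for _name, inputs, output, fn in ready:
            items[output] = fn(*(items[key] for key in inputs))
        pending = [s for s in pending if s[2] not in items]

    # Entry (i, j) of L is final once column j has been processed.
    return [[items[(i, j, j + 1)] if j <= i else 0.0 for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    for row in cholesky_dataflow([[4, 12, -16], [12, 37, -43], [-16, -43, 98]]):
        print(row)
```

All steps found ready in one round of the loop are independent of each other, so a real runtime could execute them concurrently; this sketch simply runs them one after another.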

Keywords :

eigenvalues and eigenfunctions; matrix decomposition; multi-threading; multiprocessing systems; parallel algorithms; parallel programming; software performance evaluation; AMD Barcelona system; Cilk++; Intel Harpertown; Nehalem; OpenMP; PLASMA 2.0; ScaLAPACK; asynchronous parallel Cholesky factorization algorithms; asynchronous parallel algorithms; concurrent collection performance evaluation; dense symmetric matrices; high performance multicore computing systems; linear algebra algorithms; multithreaded vendor tuned codes; parallel programming; partly asynchronous generalized eigensolver; semantic scheduling constraints; sequential BLAS; shared memory MPI; Concurrent computing; High performance computing; Linear algebra; Multicore processing; Parallel programming; Plasma density; Processor scheduling; Programming profession; Runtime; Symmetric matrices;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on

Conference_Location :

Atlanta, GA

ISSN :

1530-2075

Print_ISBN :

978-1-4244-6442-5

Type :

conf

DOI :

10.1109/IPDPS.2010.5470404

Filename :

5470404

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2440403