DocumentCode
579760
Title
Global Data Re-allocation via Communication Aggregation in Chapel
Author
Sanz, Alberto ; Asenjo, Rafael ; López, Juan ; Larrosa, Rafael ; Navarro, Angeles ; Litvinov, Vassily ; Choi, Sung-Eun ; Chamberlain, Bradford L.
Author_Institution
Dept. of Comput. Archit., Univ. of Malaga, Malaga, Spain
fYear
2012
fDate
24-26 Oct. 2012
Firstpage
235
Lastpage
242
Abstract
Chapel is a parallel programming language designed to improve the productivity and ease of use of conventional and parallel computers. The language currently delivers suboptimal performance when executing codes that perform global data re-allocation operations on distributed memory architectures. This is mainly because data is communicated without aggregation (one message for each remote array element). In this work, we analyze Chapel's standard Block and Cyclic distribution modules and optimize the communication routines for array assignments by performing aggregation. Thanks to the expressive power of Chapel, the compiler and runtime have enough information to do communication aggregation without user intervention. The runtime relies on the low-level GASNet networking layer, whose versions of one-sided bulk put/get routines that support strides are particularly useful for us. Experiments conducted on Hector (a Cray XE6) and Jaguar (a Cray XK6) reveal that the implemented techniques can lead to significant reductions in communication time.
Keywords
data communication; data handling; parallel languages; parallel programming; program compilers; Chapel standard block and cyclic distribution modules; Cray XE6; Cray XK6; Hector; Jaguar; array assignment; code execution; communication aggregation; communication routine optimization; communication time reduction; compiler; data communication; distributed memory architecture; global data reallocation operation; low-level GASNet networking layer; one-sided bulk put-get routine; parallel computer; parallel programming language; productivity; remote array element; runtime; suboptimal performance; user intervention; Arrays; Indexes; Productivity; Reactive power; Runtime; Standards; Block distribution; Chapel; communication aggregation; cyclic distribution; data re-distribution; one-sided communications;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on
Conference_Location
New York, NY
ISSN
1550-6533
Print_ISBN
978-1-4673-4790-7
Type
conf
DOI
10.1109/SBAC-PAD.2012.18
Filename
6374794