DocumentCode :
579760
Title :
Global Data Re-allocation via Communication Aggregation in Chapel
Author :
Sanz, Alberto ; Asenjo, Rafael ; López, Juan ; Larrosa, Rafael ; Navarro, Angeles ; Litvinov, Vassily ; Choi, Sung-Eun ; Chamberlain, Bradford L.
Author_Institution :
Dept. of Comput. Archit., Univ. of Malaga, Malaga, Spain
fYear :
2012
fDate :
24-26 Oct. 2012
Firstpage :
235
Lastpage :
242
Abstract :
Chapel is a parallel programming language designed to improve the productivity and ease of use of conventional and parallel computers. The language currently delivers suboptimal performance when executing codes that perform global data re-allocation operations on distributed memory architectures. This is mainly because data is communicated without aggregation (one message for each remote array element). In this work, we analyze Chapel's standard Block and Cyclic distribution modules and optimize the communication routines for array assignments by performing aggregation. Thanks to the expressive power of Chapel, the compiler and runtime have enough information to aggregate communication without user intervention. The runtime relies on the low-level GASNet networking layer, whose one-sided bulk put/get routines with stride support are particularly useful for this purpose. Experiments conducted on Hector (a Cray XE6) and Jaguar (a Cray XK6) reveal that the implemented techniques can lead to significant reductions in communication time.
Keywords :
data communication; data handling; parallel languages; parallel programming; program compilers; Chapel standard block and cyclic distribution modules; Cray XE6; Cray XK6; Hector; Jaguar; array assignment; code execution; communication aggregation; communication routine optimization; communication time reduction; compiler; data communication; distributed memory architecture; global data reallocation operation; low-level GASNet networking layer; one-sided bulk put-get routine; parallel computer; parallel programming language; productivity; remote array element; runtime; suboptimal performance; user intervention; Arrays; Indexes; Productivity; Reactive power; Runtime; Standards; Block distribution; Chapel; communication aggregation; cyclic distribution; data re-distribution; one-sided communications
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on
Conference_Location :
New York, NY
ISSN :
1550-6533
Print_ISBN :
978-1-4673-4790-7
Type :
conf
DOI :
10.1109/SBAC-PAD.2012.18
Filename :
6374794