• DocumentCode
    579760
  • Title
    Global Data Re-allocation via Communication Aggregation in Chapel
  • Author
    Sanz, Alberto ; Asenjo, Rafael ; López, Juan ; Larrosa, Rafael ; Navarro, Angeles ; Litvinov, Vassily ; Choi, Sung-Eun ; Chamberlain, Bradford L.
  • Author_Institution
    Dept. of Comput. Archit., Univ. of Malaga, Malaga, Spain
  • fYear
    2012
  • fDate
    24-26 Oct. 2012
  • Firstpage
    235
  • Lastpage
    242
  • Abstract
    Chapel is a parallel programming language designed to improve the productivity and ease of use of conventional and parallel computers. This language currently delivers suboptimal performance when executing codes that perform global data re-allocation operations on distributed memory architectures. This is mainly due to data communication that is done without aggregation (one message for each remote array element). In this work, we analyze Chapel's standard Block and Cyclic distribution modules and optimize the communication routines for array assignments by performing aggregation. Thanks to the expressive power of Chapel, the compiler and runtime have enough information to do communication aggregation without user intervention. The runtime relies on the low-level GASNet networking layer, whose versions of one-sided bulk put/get routines that support strides are particularly useful for us. Experiments conducted on Hector (a Cray XE6) and Jaguar (a Cray XK6) reveal that the implemented techniques can lead to significant reductions in communication time.
  • Keywords
    data communication; data handling; parallel languages; parallel programming; program compilers; Chapel standard block and cyclic distribution modules; Cray XE6; Cray XK6; Hector; Jaguar; array assignment; code execution; communication aggregation; communication routine optimization; communication time reduction; compiler; data communication; distributed memory architecture; global data reallocation operation; low-level GASNet networking layer; one-sided bulk put-get routine; parallel computer; parallel programming language; productivity; remote array element; runtime; suboptimal performance; user intervention; Arrays; Indexes; Productivity; Reactive power; Runtime; Standards; Block distribution; Chapel; communication aggregation; cyclic distribution; data re-distribution; one-sided communications
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on
  • Conference_Location
    New York, NY
  • ISSN
    1550-6533
  • Print_ISBN
    978-1-4673-4790-7
  • Type
    conf
  • DOI
    10.1109/SBAC-PAD.2012.18
  • Filename
    6374794