• DocumentCode
    2052626
  • Title

    A Sampling-Based Approach for Communication Libraries Auto-Tuning

  • Author

    Brunet, Élisabeth ; Trahay, François ; Denis, Alexandre ; Namyst, Raymond

  • Author_Institution
    Inst. Telecom, Telecom SudParis, Evry, France
  • fYear
    2011
  • fDate
    26-30 Sept. 2011
  • Firstpage
    299
  • Lastpage
    307
  • Abstract
    Communication performance is a critical issue in HPC applications, and many solutions have been proposed in the literature (algorithms, protocols, etc.). In the meantime, computing nodes have become massively multicore, leading to a real imbalance between the number of communication sources and the number of physical communication resources. Thus it is now mandatory to share network boards between computation flows, and to take this sharing into account while performing communication optimizations. In previous papers, we have proposed a model and a framework for on-the-fly optimizations of multiplexed concurrent communication flows, and implemented this model in the NEWMADELEINE communication library. This library features optimization strategies able, for example, to aggregate several messages to reduce the number of packets emitted on the network, or to split messages to use several NICs at the same time. In this paper, we study the tuning of these dynamic optimization strategies. We show that some parameters and thresholds (rendezvous threshold, aggregation packet size) depend on the actual hardware, both host and NICs. We propose and implement a method based on sampling of the actual hardware to auto-tune our strategies. Moreover, we show that multi-rail can greatly benefit from performance predictions. We propose an approach for multi-rail that dynamically balances the data between NICs using predictions based on sampling.
  • Keywords
    multiprocessing systems; optimisation; NEWMADELEINE communication library; autotuning communication libraries; communication flows; communication optimizations; computation flows; dynamic optimization; network boards; physical communication resources; sampling based approach; Bandwidth; Hardware; Libraries; Optimization; Protocols; Receivers; Tuning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2011 IEEE International Conference on
  • Conference_Location
    Austin, TX
  • Print_ISBN
    978-1-4577-1355-2
  • Electronic_ISBN
    978-0-7695-4516-5
  • Type

    conf

  • DOI
    10.1109/CLUSTER.2011.41
  • Filename
    6061148