• DocumentCode
    723708
  • Title
    An Approach for Energy Efficient Execution of Hybrid Parallel Programs
  • Author
    Ramapantulu, Lavanya ; Loghin, Dumitrel ; Teo, Yong Meng
  • Author_Institution
    Dept. of Comput. Sci., Nat. Univ. of Singapore, Singapore, Singapore
  • fYear
    2015
  • fDate
    25-29 May 2015
  • Firstpage
    1000
  • Lastpage
    1009
  • Abstract
    The hybrid programming model is becoming increasingly popular for HPC applications because it offers the dual advantage of exploiting inter-node distributed-memory scalability and intra-node shared-memory performance in a cluster system. One of the key challenges for energy-efficient execution of hybrid programs is to determine time- and energy-efficient hardware configurations within a large system configuration space. Given a hybrid program with an execution time deadline and an energy budget, we propose a measurement-based analytical modelling approach to determine these system configurations. In contrast to current approaches, we model both inter- and intra-node resource overlaps, memory contention among cores within a node, and network contention across multiple nodes. The model is validated against direct measurement using five representative HPC applications on Intel Xeon and ARM clusters having diverse time-energy performance. We show that a Pareto frontier consisting of optimal configurations exists for a hybrid program running on homogeneous clusters. To further optimize the Pareto frontier, we introduce a new metric, useful computation ratio (UCR), to quantify the degree of resource contention and communication overhead in an execution. We discuss how UCR and Pareto-optimal configurations can be used in conjunction by system designers to gain further insights into system resource imbalances, and by application developers to further fine-tune their hybrid programs.
  • Keywords
    Pareto optimisation; distributed shared memory systems; parallel programming; power aware computing; ARM clusters; HPC applications; Intel Xeon; Pareto frontier; Pareto-optimal configurations; UCR; energy efficient execution; energy efficient hardware configurations; hybrid parallel programs; hybrid programming model; internode distributed-memory scalability; intranode shared-memory performance; measurement-based analytical modeling approach; time efficient hardware configurations; time-energy performance; useful computation ratio; Analytical models; Clocks; Computational modeling; Current measurement; Hardware; Mathematical model; Memory management; MPI; OpenMP; Pareto-frontier; analytical model; hybrid program; inter-node; intra-node; overlap; resource contention; useful computation ratio;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International
  • Conference_Location
    Hyderabad
  • ISSN
    1530-2075
  • Type
    conf
  • DOI
    10.1109/IPDPS.2015.71
  • Filename
    7161585