• DocumentCode
    1783256
  • Title

    Algorithmic Time, Energy, and Power on Candidate HPC Compute Building Blocks

  • Author

    Choi, Jee ; Dukhan, Marat ; Liu, Xing ; Vuduc, Richard

  • Author_Institution
    Sch. of Comput. Sci. & Eng., Georgia Inst. of Technol., Atlanta, GA, USA
  • fYear
    2014
  • fDate
    19-23 May 2014
  • Firstpage
    447
  • Lastpage
    457
  • Abstract
    We conducted a microbenchmarking study of the time, energy, and power of computation and memory access on several existing platforms. These platforms represent candidate compute-node building blocks of future high-performance computing systems. Our analysis uses the "energy roofline" model, developed in prior work, which we extend in two ways. First, we improve the model's accuracy by accounting for power caps, basic memory hierarchy access costs, and measurement of random memory access patterns. Second, we empirically evaluate server-, mini-, and mobile-class platforms that span a range of compute and power characteristics. Our study covers a dozen such platforms, including x86 (both conventional and Xeon Phi), ARM, GPU, and hybrid (AMD APU and other SoC) processors. These data and our model analytically characterize the range of algorithmic regimes where we might prefer one building block to others. They suggest critical values of arithmetic intensity around which some systems may switch from being more to less time- and energy-efficient than others. They further suggest how, with respect to intensity, operations should be throttled to meet a power cap. We hope our methods can help make debates about the relative merits of these and other systems more quantitative, analytical, and insightful.
  • Keywords
    parallel processing; power aware computing; AMD-APU processor; ARM processor; GPU processor; HPC compute-node building blocks; SoC processor; Xeon Phi x86 processor; algorithmic time; analytical analysis; arithmetic intensity; basic-memory hierarchy access costs; compute characteristics; conventional x86 processor; critical values; empirical evaluation; energy roofline model; energy-efficiency; high-performance computing systems; hybrid processor; memory access pattern measurement; microbenchmarking study; miniclass platform; mobile-class platform; model accuracy improvement; power caps; power characteristics; quantitative analysis; server-class platform; time-efficiency; Abstracts; Algorithm design and analysis; Computational modeling; Graphics processing units; Mobile communication; Power measurement; algorithms; energy; performance modeling; power; system balance;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4799-3799-8
  • Type

    conf

  • DOI
    10.1109/IPDPS.2014.54
  • Filename
    6877278