DocumentCode
1783256
Title
Algorithmic Time, Energy, and Power on Candidate HPC Compute Building Blocks
Author
Jee Choi ; Dukhan, Marat ; Xing Liu ; Vuduc, Richard
Author_Institution
Sch. of Comput. Sci. & Eng., Georgia Inst. of Technol., Atlanta, GA, USA
fYear
2014
fDate
19-23 May 2014
Firstpage
447
Lastpage
457
Abstract
We conducted a microbenchmarking study of the time, energy, and power of computation and memory access on several existing platforms. These platforms represent candidate compute-node building blocks of future high-performance computing systems. Our analysis uses the "energy roofline" model, developed in prior work, which we extend in two ways. First, we improve the model's accuracy by accounting for power caps, basic memory hierarchy access costs, and measurement of random memory access patterns. Second, we empirically evaluate server-, mini-, and mobile-class platforms that span a range of compute and power characteristics. Our study covers a dozen such platforms, including x86 (both conventional and Xeon Phi), ARM, GPU, and hybrid (AMD APU and other SoC) processors. These data and our model analytically characterize the range of algorithmic regimes where we might prefer one building block to others. The analysis suggests critical values of arithmetic intensity around which some systems may switch from being more to less time- and energy-efficient than others; it further suggests how, with respect to intensity, operations should be throttled to meet a power cap. We hope our methods can help make debates about the relative merits of these and other systems more quantitative, analytical, and insightful.
Keywords
parallel processing; power aware computing; AMD-APU processor; ARM processor; GPU processor; HPC compute-node building blocks; SoC processor; Xeon Phi x86 processor; algorithmic time; analytical analysis; arithmetic intensity; basic-memory hierarchy access costs; compute characteristics; conventional x86 processor; critical values; empirical evaluation; energy roofline model; energy-efficiency; high-performance computing systems; hybrid processor; memory access pattern measurement; microbenchmarking study; miniclass platform; mobile-class platform; model accuracy improvement; power caps; power characteristics; quantitative analysis; server-class platform; time-efficiency; Abstracts; Algorithm design and analysis; Computational modeling; Graphics processing units; Mobile communication; Power measurement; algorithms; energy; performance modeling; power; system balance;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing Symposium, 2014 IEEE 28th International
Conference_Location
Phoenix, AZ
ISSN
1530-2075
Print_ISBN
978-1-4799-3799-8
Type
conf
DOI
10.1109/IPDPS.2014.54
Filename
6877278