Title :
Modeling and Predicting Performance of High Performance Computing Applications on Hardware Accelerators
Author :
Meswani, Mitesh R. ; Carrington, Laura ; Unat, Didem ; Snavely, Allan ; Baden, Scott ; Poole, Stephen
Author_Institution :
SDSC, UCSD, La Jolla, CA, USA
Abstract :
Computers with hardware accelerators, also referred to as hybrid-core systems, speed up applications by offloading certain compute operations that can run faster on accelerators. Thus, it is not surprising that many of the TOP500 supercomputers use accelerators. However, in addition to procurement cost, significant programming and porting effort is required to realize the potential benefit of such accelerators. Hence, before building such a system it is prudent to answer the question "what is the projected performance benefit from accelerators for the workloads of interest?". We address this question by way of a performance-modeling framework that predicts realizable application performance on accelerators rapidly and accurately without going to the considerable effort of porting and tuning. The modeling framework first automatically identifies compute patterns commonly found in scientific applications, which we term idioms, that may benefit from accelerator technology. Next, the framework models the predicted speedup of those idioms if they were to be ported to and run on hardware accelerators. As a proof of concept we characterize two kinds of accelerators: 1) the FPGA accelerators on a Convey HC-1 system and 2) an NVIDIA FERMI GPU accelerator. We model performance of the idioms gather/scatter and stream, and our predictions show that where these occur in two full-scale HPC applications, Milc and HYCOM, gather/scatter speeds up by as much as 15X and stream by as much as 14X, whereas the overall compute time of Milc improves by 3.4% and HYCOM by 20%.
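Illustrative note (not part of the original abstract): the sketch below shows what the gather/scatter and stream idioms typically look like as loops, and why a large per-idiom speedup can translate into a small whole-application gain. The function names, the exact loop shapes, and the Amdahl-style projection are assumptions made for illustration; the abstract does not state how the framework defines idioms or combines per-idiom predictions.

```python
def gather(src, idx):
    """Gather idiom (assumed form): dst[i] = src[idx[i]], an indirect read."""
    return [src[j] for j in idx]


def scatter(dst, idx, vals):
    """Scatter idiom (assumed form): dst[idx[i]] = vals[i], an indirect write."""
    for j, v in zip(idx, vals):
        dst[j] = v
    return dst


def stream(a, b, scalar):
    """Stream idiom (assumed form): unit-stride pass computing a[i] + scalar * b[i]."""
    return [x + scalar * y for x, y in zip(a, b)]


def projected_time_fraction(idiom_fractions, idiom_speedups):
    """Amdahl-style projection (an assumption, not the paper's stated model).

    idiom_fractions: share of baseline compute time spent in each idiom.
    idiom_speedups:  predicted speedup of each idiom on the accelerator.
    Returns the projected compute time as a fraction of the baseline.
    """
    accelerated = sum(f / idiom_speedups[name] for name, f in idiom_fractions.items())
    untouched = 1.0 - sum(idiom_fractions.values())
    return untouched + accelerated


if __name__ == "__main__":
    # Hypothetical inputs: 10% of time in gather/scatter, predicted 10x speedup.
    fraction = projected_time_fraction({"gather_scatter": 0.10}, {"gather_scatter": 10.0})
    print(f"projected time relative to baseline: {fraction:.2f}")
```

Under these hypothetical inputs, only the 10% of time spent in the idiom shrinks, so the whole-application improvement stays below 10% no matter how large the idiom speedup is. That is consistent with the abstract reporting double-digit idiom speedups alongside single- or low-double-digit overall improvements.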
Keywords :
field programmable gate arrays; graphics processing units; parallel machines; FPGA accelerator; HPC application; HYCOM; Milc; NVIDIA FERMI GPU accelerator; hardware accelerator; high performance computing; hybrid-core system; performance-modeling framework; procurement cost; supercomputer; Bandwidth; Computational modeling; Field programmable gate arrays; Graphics processing unit; Hardware; Performance evaluation; Predictive models; FPGA; GPU; HPC; accelerators; benchmarking; performance modeling; performance prediction;
Conference_Title :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0974-5
DOI :
10.1109/IPDPSW.2012.226