Title :
Evaluation and optimization of multicore performance bottlenecks in supercomputing applications
Author :
Diamond, Jeff ; Burtscher, Martin ; McCalpin, John D. ; Kim, Byoung-Do ; Keckler, Stephen W. ; Browne, James C.
Author_Institution :
Univ. of Texas at Austin, Austin, TX, USA
Abstract :
The computation nodes of modern supercomputers commonly consist of multiple multicore processors. Maximizing the performance of such systems requires measurement, analysis, and optimization techniques that specifically target multicore environments. This paper first examines traditional unicore metrics and demonstrates how they can be misleading in a multicore system. Second, it examines and characterizes performance bottlenecks specific to multicore-based systems. Third, it describes performance measurement challenges that arise in multicore systems and outlines methods for extracting sound measurements that lead to performance optimization opportunities. The measurement and analysis process is based on a case study of the HOMME atmospheric modeling benchmark code from NCAR running on supercomputers built upon AMD Barcelona and Intel Nehalem quad-core processors. Applying the multicore bottleneck analysis to HOMME led to multicore-aware source-code optimizations that increased performance by up to 35%. While the case studies were carried out on multichip nodes of supercomputers using an HPC application as the target for optimization, the pitfalls identified and the insights obtained should apply to any system that is composed of multicore processors.
Keywords :
mainframes; microprocessor chips; multiprocessing systems; performance evaluation; source coding; AMD Barcelona; HOMME atmospheric modeling benchmark code; HPC application; Intel Nehalem quadcore processor; NCAR; multichip nodes; multicore aware source code optimization; multicore performance bottlenecks; multicore processors; performance measurement; performance optimization; sound measurement; supercomputing; unicore metrics; Atmospheric modeling; Multicore processing; Optimization; Program processors; Scalability; Semiconductor device measurement
Conference_Title :
Performance Analysis of Systems and Software (ISPASS), 2011 IEEE International Symposium on
Conference_Location :
Austin, TX
Print_ISBN :
978-1-61284-367-4
Electronic_ISBN :
978-1-61284-368-1
DOI :
10.1109/ISPASS.2011.5762713