DocumentCode :
1918938
Title :
Performance Modeling of Shared Memory Multiple Issue Multicore Machines
Author :
Mitra, Rajendu ; Joshi, Bharat S. ; Ravindran, Ajith ; Mukherjee, Arjun ; Adams, Rene
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of North Carolina at Charlotte, Charlotte, NC, USA
fYear :
2012
fDate :
10-13 Sept. 2012
Firstpage :
464
Lastpage :
473
Abstract :
The process of developing optimal parallel applications is computationally expensive. The goal of this work is to design and validate a Markov chain based system-level performance prediction model to efficiently optimize parallel applications on shared memory multicore processors with coarse-grain thread level parallelism (TLP), such as the Intel Xeon Clovertown. In the Markov chain based throughput prediction model, the machine micro-architecture is represented by the different states and the allowable transitions between them. The program characteristics (such as cache misses, branch mispredictions, divisions, denormalized computations and other large latency operations) are included using the failure probabilities of active and suspended threads. The improvement in performance is achieved by extracting information from running a representative data-set of the actual application. The model is validated with multiple benchmarks (an electromagnetics application, parallel BZIP, FFT, etc.) using VTune, Intel's performance analyzer. The average performance prediction error is less than 10%. The total run time of the model is on the order of minutes (including VTune analyzer measurement timings), whereas the actual application runs for a few hours.
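As a rough illustration of the idea in the abstract, the sketch below models a single thread as a two-state Markov chain (active, suspended) and turns its steady-state distribution into a throughput estimate. This is not the authors' model: the two-state structure, the function names, and the parameters `p_fail` (probability that an active thread hits a large latency event in a cycle), `p_resume` (probability that a suspended thread resumes), and `issue_width` are assumptions made only for illustration.

```python
def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration, starting from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        # One step of pi <- pi * P
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def two_state_chain(p_fail, p_resume):
    """Hypothetical transition matrix: state 0 = active, state 1 = suspended."""
    return [
        [1.0 - p_fail, p_fail],      # active -> active, active -> suspended
        [p_resume, 1.0 - p_resume],  # suspended -> active, suspended -> suspended
    ]


def predicted_ipc(p_fail, p_resume, issue_width):
    """Assumed throughput estimate: issue width scaled by the long-run
    fraction of cycles the thread spends in the active state."""
    pi_active, _ = stationary_distribution(two_state_chain(p_fail, p_resume))
    return issue_width * pi_active


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the paper.
    print(predicted_ipc(p_fail=0.1, p_resume=0.4, issue_width=4))
```

For this two-state case the steady-state active fraction has the closed form p_resume / (p_fail + p_resume), so the power iteration is only needed once the chain has more states than this toy example.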
Keywords :
Markov processes; cache storage; multi-threading; parallel architectures; probability; shared memory systems; Intel performance analyzer; Markov chain based system-level performance prediction model; Markov chain based throughput prediction model; TLP; VTune analyzer measurement timing; branch misprediction; cache misses; coarse-grain thread level parallelism; failure probability; machine microarchitecture; multicore machines; optimal parallel application; performance modeling; performance prediction error; program characteristics; shared memory multicore processor; Analytical models; Computational modeling; Instruction sets; Markov processes; Mathematical model; Multicore processing; Analytical Modeling; Multithreading; Performance Prediction; Throughput model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing Workshops (ICPPW), 2012 41st International Conference on
Conference_Location :
Pittsburgh, PA
ISSN :
1530-2016
Print_ISBN :
978-1-4673-2509-7
Type :
conf
DOI :
10.1109/ICPPW.2012.64
Filename :
6337514