Title :
Do inputs matter? Using data-dependence profiling to evaluate thread level speculation in BG/Q
Author :
Bhattacharyya, A.
Author_Institution :
Dept. of Comput. Sci., Univ. of Alberta, Edmonton, AB, Canada
Abstract :
Figure 1 shows the performance of three parallel versions of the SPEC2006 and PolyBench/C benchmarks: auto-SIMDized; auto-SIMDized+auto-OpenMP by bgxlc_r; and auto-SIMDized+auto-OpenMP+speculatively parallelized by an automatic speculative parallelization framework developed in this work. The speculative loops in lbm have 98% coverage, which accounts for its speedup, while in bzip2 (35%) and dynprog (26%) the poor coverage of speculative loops introduces overhead. h264ref has the highest number of speculatively parallelized loops (47), but most of them contain function calls that introduce dependences, causing a slowdown (only 12% of speculative threads committed successfully). Filtering out speculative execution of loops with non-side-effect-free function calls tackles the misspeculation overhead. cholesky and dynprog experience L1 cache misses due to LR mode (12% and 10%, respectively), while jacobi and seidel experience a large increase in dynamic path length (112% and 123%, respectively, over sequential).
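The link between loop coverage and achievable speedup can be illustrated with Amdahl's law. The sketch below is not taken from the paper: it uses the coverage figures quoted above, while the thread count (ASSUMED_THREADS) and the function names are hypothetical choices made for illustration. It ignores speculation overhead entirely, so it gives only an upper bound on the speedup that speculative loops could contribute.

```python
def amdahl_bound(coverage: float, threads: int) -> float:
    """Upper bound on whole-program speedup when only the fraction
    `coverage` of sequential run time is executed on `threads` threads."""
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be in [0, 1]")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return 1.0 / ((1.0 - coverage) + coverage / threads)


# Coverage of speculative loops as quoted in the abstract.
COVERAGE = {"lbm": 0.98, "bzip2": 0.35, "dynprog": 0.26}

# Assumption for illustration only; the abstract does not state a thread count.
ASSUMED_THREADS = 4


def bounds(threads: int = ASSUMED_THREADS) -> dict:
    """Amdahl upper bound for each benchmark at the given thread count."""
    return {name: amdahl_bound(c, threads) for name, c in COVERAGE.items()}


if __name__ == "__main__":
    for name, bound in bounds().items():
        print(f"{name}: at most {bound:.2f}x")
```

With coverage as low as 35% or 26%, the bound stays close to 1 regardless of thread count, so even modest speculation overhead can outweigh the gain, which is consistent with the behaviour reported for bzip2 and dynprog.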
Keywords :
cache storage; parallel programming; BG/Q; PolyBench/C benchmarks; SPEC2006; cholesky experience L1 cache; data-dependence profiling; dynprog experience L1 cache; non-side-effect-free function calls; thread level speculation; Benchmark testing; Educational institutions; Electronic mail; Hardware; Runtime; Software;
Conference_Title :
Parallel Architectures and Compilation Techniques (PACT), 2013 22nd International Conference on
Conference_Location :
Edinburgh
Print_ISBN :
978-1-4799-1018-2
DOI :
10.1109/PACT.2013.6618836