DocumentCode :
2720990
Title :
Assessment of barrier implementations for fine-grain parallel regions on current multi-core architectures
Author :
Berger, Simon A. ; Stamatakis, Alexandros
Author_Institution :
Dept. of Comput. Sci., Tech. Univ. Munchen, Garching, Germany
fYear :
2010
fDate :
20-24 Sept. 2010
Firstpage :
1
Lastpage :
8
Abstract :
Barrier performance for synchronizing threads on current multi-core systems can be critical for scientific applications that traverse a large number of relatively small parallel regions, that is, applications that exhibit an unfavorable computation-to-synchronization ratio. Using a synthetic and a real-world benchmark, we assess 4 alternative barrier implementations on 7 current multi-core systems with 2 to 32 cores. We find that barrier performance is application- and data-specific with respect to cache utilization, but that a rather naïve lock-free barrier implementation yields good results across all applications and multi-core systems tested. We also assess distinct implementations of reduction operations that are computed in conjunction with the barriers. The synthetic and real-world benchmarks are made available as open-source code for further testing.
Keywords :
cache storage; multiprocessing systems; parallel architectures; synchronisation; barrier implementations; cache utilization; fine-grain parallel regions; multicore architectures; open source code; threads synchronization; Benchmark testing; Instruction sets; Message systems; Optimization; Organisms; Radiation detectors; Synchronization; RAxML; barriers; multi-cores; threads
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS), 2010 IEEE International Conference on
Conference_Location :
Heraklion, Crete
Print_ISBN :
978-1-4244-8395-2
Electronic_ISBN :
978-1-4244-8397-6
Type :
conf
DOI :
10.1109/CLUSTERWKSP.2010.5613080
Filename :
5613080