Title :
An Efficient Barrier Implementation for OpenMP-Like Parallelism on the Intel SCC
Author :
Al-Khalissi, Hayder ; Shah, S.A.A. ; Berekovic, Mladen
Author_Institution :
Chip-Design for Embedded Syst., Tech. Univ. Braunschweig, Braunschweig, Germany
Abstract :
This paper proposes effective barrier synchronization implementations for shared-memory parallel programming models (e.g., OpenMP) on the non-cache-coherent Intel SCC platform. Barrier synchronization primitives are key components of these programming models, which use them to coordinate parallel threads. High-level barrier constructs can therefore perform well only if the underlying synchronization algorithms are implemented efficiently. In particular, we present an efficient evaluation method to determine the overhead associated with integrating the barrier algorithms required by OpenMP runtime libraries. We validate several implementation variants that make efficient use of the network topology and SCC-specific hardware. Our experimental results for different microbenchmarks show significant performance improvements of up to 98% on 48 cores.
Keywords :
application program interfaces; cache storage; cloud computing; parallel programming; shared memory systems; synchronisation; system-on-chip; Intel SCC noncache-coherent platform; OpenMP runtime libraries; OpenMP-like parallelism; barrier synchronization primitives; microbenchmarks; network topology; parallel threads; shared memory-based parallel programming models; single-chip cloud computer; system-on-chip; Delays; Instruction sets; Message systems; Programming; Radiation detectors; Synchronization; Table lookup; Barrier synchronization; Many-cores; OpenMP model; Performance Evaluation; System-on-Chip;
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2014 22nd Euromicro International Conference on
Conference_Location :
Torino
DOI :
10.1109/PDP.2014.25