DocumentCode :
3206209
Title :
GLocks: Efficient Support for Highly-Contended Locks in Many-Core CMPs
Author :
Abellan, Jose L. ; Fernandez, J. ; Acacio, Manuel E.
Author_Institution :
Dept. de Ing. y Tecnología de Comput., Univ. de Murcia, Murcia, Spain
fYear :
2011
fDate :
16-20 May 2011
Firstpage :
893
Lastpage :
905
Abstract :
Synchronization is of paramount importance for exploiting thread-level parallelism on many-core CMPs. In these architectures, synchronization mechanisms usually rely on shared variables to coordinate multithreaded access to shared data structures, thus avoiding data dependency conflicts. Lock synchronization is known to be a key limitation to performance and scalability. On the one hand, lock acquisition through busy waiting on shared variables generates additional coherence activity, which interferes with applications. On the other hand, lock contention causes serialization, which results in performance degradation. This paper proposes and evaluates GLocks, a hardware-supported implementation for highly-contended locks in the context of many-core CMPs. GLocks use a token-based message-passing protocol over a dedicated network built on state-of-the-art technology. This approach skips the memory hierarchy to provide a non-intrusive, extremely efficient and fair lock implementation with negligible impact on energy consumption or die area. A comprehensive comparison against the most efficient shared-memory-based lock implementation for a set of microbenchmarks and real applications quantifies the benefits of GLocks. Performance results show an average reduction of 42% and 14% in execution time, an average reduction of 76% and 23% in network traffic, and an average reduction of 78% and 28% in the energy-delay² product (ED²P) metric for the full CMP for the microbenchmarks and the real applications, respectively. In light of our performance results, we can conclude that GLocks satisfy our initial working hypothesis. GLocks minimize cache-coherence network traffic due to lock synchronization, which translates into reduced power consumption and execution time.
Keywords :
cache storage; data structures; energy consumption; message passing; multi-threading; multiprocessing systems; parallel processing; power aware computing; GLocks; cache-coherence network traffic; dedicated network; energy consumption; energy-delay² product metric; execution time reduction; highly-contended locks; lock synchronization; many-core CMP; memory hierarchy; multithreaded access coordination; power consumption reduction; shared data structures; synchronization mechanism; thread-level parallelism; token-based message-passing protocol; Hardware; Instruction sets; Proposals; Protocols; Radiation detectors; Scalability; Synchronization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing Symposium (IPDPS), 2011 IEEE International
Conference_Location :
Anchorage, AK
ISSN :
1530-2075
Print_ISBN :
978-1-61284-372-8
Electronic_ISBN :
1530-2075
Type :
conf
DOI :
10.1109/IPDPS.2011.87
Filename :
6012899