DocumentCode :
2027290
Title :
A lock-free cache-friendly software queue buffer for decoupled software pipelining
Author :
Chen, Wen Ren ; Yang, Wuu ; Hsu, Wei Chung
Author_Institution :
Programming Language & Syst. Lab., Nat. Chiao Tung Univ., Hsinchu, Taiwan
fYear :
2010
fDate :
16-18 Dec. 2010
Firstpage :
997
Lastpage :
1006
Abstract :
Multicore processors have become common in server and client computers in recent years. Parallelization is one way to fully utilize the computing power of multicore architectures. Most applications of interest have complex data and control dependencies, which make traditional parallelization techniques such as DOALL and DOACROSS inapplicable. Decoupled Software Pipelining (DSWP), a new parallelization technique, shows potential for parallelizing general applications. However, its success relies on fast inter-core synchronization and communication. On commodity multicore platforms, the performance of current DSWP implementations is disappointing because the overhead of lock-based, cache-unfriendly software approaches offsets the benefit of DSWP. We present a lock-free, cache-friendly software queue designed for DSWP. A lock-free, cache-friendly solution needs to take two different aspects of the memory system, memory coherence and memory consistency, into consideration. We show how inattention to these two aspects leads to incorrect or inefficient solutions. We also present our approach to providing a correct and efficient solution, with a detailed explanation. Due to the nondeterministic nature of parallel programs, traditional testing techniques cannot fully verify the correctness of an implementation. We therefore discuss the correctness of our implementation both informally and formally.
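Illustrative sketch (not taken from the paper): the abstract names two concerns for a lock-free software queue, memory coherence and memory consistency. The C++ fragment below is a generic single-producer/single-consumer ring buffer that shows one common way each concern is usually handled. The names `SpscQueue` and `run_pipeline`, the 64-byte cache-line size, and the power-of-two capacity are all assumptions made for illustration; the authors' actual queue design may differ substantially.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

// Hypothetical SPSC ring buffer; NOT the design from the paper.
// Assumes a 64-byte cache line and exactly one producer and one consumer.
template <typename T, std::size_t N>
class SpscQueue {
    static_assert((N & (N - 1)) == 0, "N must be a power of two");
public:
    // Called only by the producer thread.
    bool try_push(const T& v) {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        if (t - head_.load(std::memory_order_acquire) == N) return false;  // full
        buf_[t & (N - 1)] = v;
        // Consistency: release makes the slot write visible before the new tail.
        tail_.store(t + 1, std::memory_order_release);
        return true;
    }
    // Called only by the consumer thread.
    bool try_pop(T& out) {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;  // empty
        out = buf_[h & (N - 1)];
        // Release so the producer reuses the slot only after it has been read.
        head_.store(h + 1, std::memory_order_release);
        return true;
    }
private:
    // Coherence: keep the two indices on separate cache lines so the producer
    // and consumer do not invalidate each other's line on every update.
    alignas(64) std::atomic<std::size_t> head_{0};
    alignas(64) std::atomic<std::size_t> tail_{0};
    alignas(64) T buf_[N];
};

// Two-stage pipeline: stage 1 produces 0..n-1, stage 2 sums what it receives.
long long run_pipeline(int n) {
    SpscQueue<int, 64> q;
    long long sum = 0;
    std::thread producer([&] {
        for (int i = 0; i < n; ++i)
            while (!q.try_push(i)) std::this_thread::yield();
    });
    std::thread consumer([&] {
        int v = 0;
        for (int i = 0; i < n; ++i) {
            while (!q.try_pop(v)) std::this_thread::yield();
            sum += v;
        }
    });
    producer.join();
    consumer.join();
    return sum;
}
```

Calling `run_pipeline(1000)` runs the two stages on separate threads and returns the sum of the values passed through the queue. Dropping the acquire/release ordering would risk the consumer reading a slot before its contents are visible, and removing the alignment would typically cause false sharing between the indices; these are the kinds of incorrect or inefficient outcomes the abstract alludes to.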
Keywords :
cache storage; multiprocessing systems; pipeline processing; DSWP; decoupled software pipelining; intercore synchronization; lock free cache friendly software; memory coherence; memory consistency; memory system; multicore architecture; parallel program; parallelization technique; queue buffer; Coherence; Delay; Hardware; Instruction sets; Multicore processing; Synchronization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Symposium (ICS), 2010 International
Conference_Location :
Tainan
Print_ISBN :
978-1-4244-7639-8
Type :
conf
DOI :
10.1109/COMPSYM.2010.5685364
Filename :
5685364