DocumentCode :
1872455
Title :
CellMT: A cooperative multithreading library for the Cell/B.E.
Author :
Beltran, Vicenç ; Carrera, David ; Torres, Jordi ; Ayguade, Eduard
Author_Institution :
Barcelona Super Comput. Center, Barcelona, Spain
fYear :
2009
fDate :
16-19 Dec. 2009
Firstpage :
245
Lastpage :
253
Abstract :
The Cell BE processor has proved that heterogeneous multi-core systems can provide huge computational power with high efficiency for a wide range of applications. The simple design of the computational units and the use of small managed local memories are the key to achieving high efficiency and performance at the same time. However, this simple and efficient hardware design comes at the price of higher code complexity. Code written to run on this kind of processor must deal with several issues, such as code vectorization, loop unrolling, and the explicit management of local memories. Some of these issues, such as vectorization and loop unrolling, can be partially solved by the compiler, but the overlapping of data transfer and computation times must be addressed manually by the programmer with techniques such as double buffering, which increase code complexity. In this paper we present a user-level threading library called CellMT that effectively hides memory latencies. The concurrent execution of several threads inside each SPU naturally overlaps computation and data transfer times without increasing code complexity. To prove the suitability and feasibility of our multi-threaded library, we perform an exhaustive performance evaluation with a synthetic benchmark and a real application. The experimental results show that the multithreaded approach can outperform a hand-coded double buffering scheme, with speedups from 0.96x to 3.2x, while maintaining the complexity of a naive buffering scheme.
Keywords :
multi-threading; multiprocessing systems; program compilers; Cell BE processor; code complexity; code vectorization; compiler; cooperative multithreading library; data transfer; exhaustive performance evaluation; hand-coded double buffering scheme; hardware design; heterogeneous multicore systems; naive buffering scheme; user level threading library; Delay; Hardware; High performance computing; Libraries; Memory management; Multithreading; Power system management; Program processors; Programming profession; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing (HiPC), 2009 International Conference on
Conference_Location :
Kochi
Print_ISBN :
978-1-4244-4922-4
Electronic_ISBN :
978-1-4244-4921-7
Type :
conf
DOI :
10.1109/HIPC.2009.5433205
Filename :
5433205