DocumentCode :
579755
Title :
Efficient Sorting on the Tilera Manycore Architecture
Author :
Morari, Alessandro ; Tumeo, Antonino ; Villa, Oreste ; Secchi, Simone ; Valero, Mateo
Author_Institution :
Pacific Northwest Nat. Lab., Richland, WA, USA
fYear :
2012
fDate :
24-26 Oct. 2012
Firstpage :
171
Lastpage :
178
Abstract :
We present an efficient implementation of the radix sort algorithm for the Tilera TILEPro64 processor. The TILEPro64 is one of the first successful commercial manycore processors. It is composed of 64 tiles interconnected through multiple fast networks-on-chip and features a fully coherent, shared distributed cache. The architecture has a large degree of flexibility and allows various optimization strategies. We describe how we mapped the algorithm to this architecture. We present an in-depth analysis of the optimizations for each phase of the algorithm with respect to the processor's sustained performance. We discuss the overall throughput reached by our radix sort implementation (up to 132 MK/s) and show that it provides comparable or better performance-per-watt with respect to state-of-the-art implementations on x86 processors and graphics processing units.
Keywords :
graphics processing units; network-on-chip; optimisation; shared memory systems; sorting; Tilera TILEPro64 processor; Tilera manycore architecture; commercial manycore processors; graphic processing units; networks-on-chip; optimization strategies; radix sort algorithm; shared distributed cache; Bandwidth; Computer architecture; Histograms; Instruction sets; Optimization; Sorting; Tiles;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on
Conference_Location :
New York, NY
ISSN :
1550-6533
Print_ISBN :
978-1-4673-4790-7
Type :
conf
DOI :
10.1109/SBAC-PAD.2012.41
Filename :
6374786