DocumentCode
3244902
Title
Count Sort for GPU Computing
Author
Sun, Weidong ; Ma, Zongmin
Author_Institution
Sch. of Comput., Shenyang Inst. of Aeronaut. Eng., Shenyang, China
fYear
2009
fDate
8-11 Dec. 2009
Firstpage
919
Lastpage
924
Abstract
Counting sort is a simple, stable and efficient sort algorithm with linear running time, which is a fundamental building block for many applications. This paper depicts the design issues of a data parallel implementation of the count sort algorithm on a commodity multiprocessor GPU using the Compute Unified Device Architecture (CUDA) platform, both from NVIDIA Corporation. The full parallel version runs much faster than any serial implementation on CPU with the loss of stability due to the limitation of the massive threads parallel model. But the thread-level parallel implementation still provides an efficient parallel sort primitive for many applications, which do not require stable sort or can be adapted for unstable subroutines.
Keywords
multiprocessing programs; parallel processing; software architecture; sorting; CUDA platform; GPU computing; commodity multiprocessor GPU; compute unified device architecture; counting sort; data parallel implementation; linear running time; sort algorithm; Application software; Bioinformatics; Computer architecture; Computer graphics; Concurrent computing; Hardware; Parallel processing; Parallel programming; Programming profession; Yarn; CUDA; Count sort; GPGPU; Parallel programming;
fLanguage
English
Publisher
ieee
Conference_Titel
Parallel and Distributed Systems (ICPADS), 2009 15th International Conference on
Conference_Location
Shenzhen
ISSN
1521-9097
Print_ISBN
978-1-4244-5788-5
Type
conf
DOI
10.1109/ICPADS.2009.30
Filename
5395307
Link To Document