• DocumentCode
    3244902
  • Title

    Count Sort for GPU Computing

  • Author

    Sun, Weidong ; Ma, Zongmin

  • Author_Institution
    Sch. of Comput., Shenyang Inst. of Aeronaut. Eng., Shenyang, China
  • fYear
    2009
  • fDate
    8-11 Dec. 2009
  • Firstpage
    919
  • Lastpage
    924
  • Abstract
    Counting sort is a simple, stable and efficient sort algorithm with linear running time, which is a fundamental building block for many applications. This paper depicts the design issues of a data parallel implementation of the count sort algorithm on a commodity multiprocessor GPU using the Compute Unified Device Architecture (CUDA) platform, both from NVIDIA Corporation. The full parallel version runs much faster than any serial implementation on CPU with the loss of stability due to the limitation of the massive threads parallel model. But the thread-level parallel implementation still provides an efficient parallel sort primitive for many applications, which do not require stable sort or can be adapted for unstable subroutines.
  • Keywords
    multiprocessing programs; parallel processing; software architecture; sorting; CUDA platform; GPU computing; commodity multiprocessor GPU; compute unified device architecture; counting sort; data parallel implementation; linear running time; sort algorithm; Application software; Bioinformatics; Computer architecture; Computer graphics; Concurrent computing; Hardware; Parallel processing; Parallel programming; Programming profession; Yarn; CUDA; Count sort; GPGPU; Parallel programming;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Parallel and Distributed Systems (ICPADS), 2009 15th International Conference on
  • Conference_Location
    Shenzhen
  • ISSN
    1521-9097
  • Print_ISBN
    978-1-4244-5788-5
  • Type

    conf

  • DOI
    10.1109/ICPADS.2009.30
  • Filename
    5395307