Title :
GRS — GPU radix sort for multifield records
Author :
Bandyopadhyay, Shibdas ; Sahni, Sartaj
Author_Institution :
Dept. of Comput. & Inf. Sci. & Eng., Univ. of Florida, Gainesville, FL, USA
Abstract :
We develop a radix sort algorithm, GRS, suitable for sorting multifield records on a graphics processing unit (GPU). We assume the ByField layout for the records to be sorted. GRS is benchmarked against the radix sort algorithm, SDK, in NVIDIA's CUDA SDK 3.0 as well as the radix sort algorithm, SRTS, of Merrill and Grimshaw. Although SRTS is faster than both GRS and SDK when sorting numbers as well as records that have a key and an additional 32-bit field, both GRS and SDK outperform SRTS on records with 2 or more fields (in addition to the key). GRS is consistently faster than SDK on numbers as well as records with 1 or more fields. When sorting records with 9 32-bit fields, GRS is up to 74% faster than SRTS and up to 55% faster than SDK. Thus, GRS is the fastest way to radix sort records with more than 1 32-bit field on a GPU.
Keywords :
computer graphic equipment; coprocessors; parallel architectures; records management; sorting; ByField layout; GRS-GPU radix sort algorithm; NVIDIA CUDA SDK 3.0; SRTS; compute unified driver architecture; graphics processing unit; multifield record; Graphics processing unit; Histograms; Instruction sets; Kernel; Layout; Registers; Tiles; Graphics Processing Units; radix sort; sorting multifield records
Conference_Title :
High Performance Computing (HiPC), 2010 International Conference on
Conference_Location :
Dona Paula
Print_ISBN :
978-1-4244-8518-5
Electronic_ISBN :
978-1-4244-8519-2
DOI :
10.1109/HIPC.2010.5713164