• DocumentCode
    656165
  • Title
    Parallel Radix Sort on the AMD Fusion Accelerated Processing Unit
  • Author
    Delorme, Michael C. ; Abdelrahman, Tarek S. ; Zhao, Chengyan
  • Author_Institution
    Edward S. Rogers Sr. Dept. of Electr. & Comput. Eng., Univ. of Toronto, Toronto, ON, Canada
  • fYear
    2013
  • fDate
    1-4 Oct. 2013
  • Firstpage
    339
  • Lastpage
    348
  • Abstract
    We design, implement and evaluate a parallel radix sort that simultaneously utilizes the CPU and GPU devices on the AMD Fusion APU. The parallel sort, referred to as Fusion Sort, partitions the sort keys between the CPU and GPU devices and utilizes the integrated memory system of the APU to avoid data copying between the devices. We identify three design issues that impact overhead and performance: the granularity of sharing between the two devices, the scheme of data partitioning and the allocation of data in memory regions accessible by each device. We present three variants of Fusion Sort that share data at coarse and fine granularities and with fixed and variable data partitioning schemes. In each variant, data is allocated to minimize the overhead of non-preferred memory accesses of each device. Our evaluation shows that fine-grain sharing with variable data partitioning performs the best. Further, Fusion Sort outperforms CPU-only and GPU-only parallel radix sorts by up to 1.8X and 1.9X respectively. These results demonstrate the viability of the integrated memory system of the APU in the context of sorting.
  • Keywords
    graphics processing units; multiprocessing systems; parallel processing; sorting; AMD fusion APU; AMD fusion accelerated processing unit; CPU; FusionSort; GPU devices; data partitioning schemes; integrated memory system; parallel radix sort; sort keys partitioning; Central Processing Unit; Graphics processing units; Kernel; Performance evaluation; Resource management; Synchronization; Tiles; CPU-GPU architectures; heterogeneous architectures; performance evaluation; radix sort
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing (ICPP), 2013 42nd International Conference on
  • Conference_Location
    Lyon
  • ISSN
    0190-3918
  • Type
    conf
  • DOI
    10.1109/ICPP.2013.43
  • Filename
    6687367