• DocumentCode
    2320447
  • Title
    GPU Performance Enhancement via Communication Cost Reduction: Case Studies of Radix Sort and WSN Relay Node Placement Problem
  • Author
    Lee, Che-Rung ; Lo, Shih-Hsiang ; Chen, Nan-Hsi ; Chung, Yeh-Ching ; Chung, I-Hsin
  • Author_Institution
    Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
  • fYear
    2012
  • fDate
    13-16 May 2012
  • Firstpage
    132
  • Lastpage
    139
  • Abstract
    As the computational power of Graphics Processing Units (GPUs) increases, data transmission becomes the major performance bottleneck. In this study, we investigate two techniques, data streaming and data compression, to reduce the communication cost on GPUs. Data streaming enables overlap of communication and computation, whereas data compression reduces the size of the data transferred among different memory spaces. Although both techniques increase computation cost, overall performance can still be enhanced by reducing communication cost. We demonstrate the effectiveness of the two techniques via two case studies: radix sort and 3-star, a deployment algorithm in wireless sensor networks. For radix sort, a new algorithm, which mixes MSD and LSD algorithms and employs data streaming, is presented. It is 25% faster than the fastest GPU radix sort implementation currently available in the public domain. For the 3-star algorithm, the GPU implementation runs several hundred times faster than the CPU code. Data streaming and data compression, which is a hybrid CPU-GPU algorithm, provide an additional 54% performance improvement over the GPU implementation. Data compression not only reduces communication cost but also improves the computation time, which enables further performance enhancement.
  • Keywords
    cost reduction; data communication; data compression; graphics processing units; performance evaluation; wireless sensor networks; 3-star algorithm; CPU code; CPU-GPU algorithm; GPU performance enhancement; LSD algorithms; MSD algorithms; WSN relay node placement problem; communication cost reduction; computational power; data streaming; data transmission; graphics processing unit; memory spaces; radix sort; Approximation algorithms; Instruction sets; Kernel; Partitioning algorithms; Relays; GPU
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium on
  • Conference_Location
    Ottawa, ON
  • Print_ISBN
    978-1-4673-1395-7
  • Type
    conf
  • DOI
    10.1109/CCGrid.2012.16
  • Filename
    6217414