• DocumentCode
    3608
  • Title
    A Cache Hierarchy Aware Thread Mapping Methodology for GPGPUs
  • Author
    Lai, Bo-Cheng Charles ; Hsien-Kai Kuo ; Jing-Yang Jou
  • Author_Institution
    Dept. of Electron. Eng., Nat. Chiao-Tung Univ., Hsinchu, Taiwan
  • Volume
    64
  • Issue
    4
  • fYear
    2015
  • fDate
    Apr-15
  • Firstpage
    884
  • Lastpage
    898
  • Abstract
    The recently proposed GPGPU architecture has added a multi-level hierarchy of shared cache to better exploit the data locality of general-purpose applications. The GPGPU design philosophy allocates most of the chip area to processing cores, which results in a relatively small cache shared by a large number of cores when compared with conventional multi-core CPUs. Applying a proper thread mapping scheme is crucial for gaining from constructive cache sharing and avoiding resource contention among thousands of threads. However, due to the significant differences in architectures and programming models, the existing thread mapping approaches for multi-core CPUs do not perform as effectively on GPGPUs. This paper proposes a formal model that captures both the characteristics of threads and the cache sharing behavior of a multi-level shared cache. With appropriate proofs, the model forms a solid theoretical foundation for the proposed cache hierarchy aware thread mapping methodology for multi-level shared cache GPGPUs. The experiments reveal that the three-stage thread mapping methodology can successfully improve data reuse at each cache level of GPGPUs and achieve an average of 2.3× to 4.3× runtime enhancement when compared with existing approaches.
  • Keywords
    cache storage; graphics processing units; integrated circuit design; multi-threading; performance evaluation; shared memory systems; GPGPU architecture; GPGPU design philosophy; cache hierarchy aware thread mapping methodology; chip area allocation; constructive cache sharing; data reuse improvement; multilevel shared cache hierarchy; performance analysis; processing cores; programming models; shared memory; Arrays; Graphics processing units; Instruction sets; Kernel; Message systems; Optimization; Multithreaded processors; cache memories; performance analysis and design aids
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type
    jour
  • DOI
    10.1109/TC.2014.2308179
  • Filename
    6747979