• DocumentCode
    618550
  • Title
    Accelerating atomic operations on GPGPUs
  • Author
    Franey, Sean ; Lipasti, M.
  • Author_Institution
    Electr. & Comput. Eng., Univ. of Wisconsin - Madison, Madison, WI, USA
  • fYear
    2013
  • fDate
    21-24 April 2013
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    General purpose computing on GPUs (GPGPU) has experienced rapid growth over the last several years as new application realms are explored and traditional highly parallel algorithms are adapted to this computational substrate. However, a large portion of the parallel workload space, both in emerging and traditional domains, remains ill-suited for GPGPU deployment due to high reliance on atomic operations, particularly as global synchronization mechanisms. Unlike the sophisticated synchronization primitives available on supercomputers, GPGPU applications must rely on slow atomic operations on shared data. Further, unlike general purpose processors which take advantage of coherent L1 caches to speed up atomic operations, the cost and complexity of coherency on the GPU, coupled with the fact that a GPU's primary revenue stream - graphics rendering - does not benefit, means that new approaches are needed to improve atomics on the GPU. In this paper, we present a mechanism for implementing low-cost coherence and speculative acquisition of atomic data on the GPU that allows applications that utilize atomics to greater extents than is generally accepted practice today, to perform much better than they do on current hardware. As our results show, these unconventional applications can realize non-trivial performance improvements approaching 20% with our proposed system. With this mechanism, the scope of applications that can be accelerated by these commodity, highly-parallel pieces of hardware can be greatly expanded.
  • Keywords
    cache storage; graphics processing units; mainframes; rendering (computer graphics); GPGPU; L1 caches; atomic operations; general purpose computing; graphics rendering; supercomputers; Coherence; Complexity theory; Graphics processing units; Hardware; Synchronization; Wires;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Networks on Chip (NoCS), 2013 Seventh IEEE/ACM International Symposium on
  • Conference_Location
    Tempe, AZ
  • Print_ISBN
    978-1-4673-6491-1
  • Electronic_ISBN
    978-1-4673-6492-8
  • Type
    conf
  • DOI
    10.1109/NoCS.2013.6558404
  • Filename
    6558404