• DocumentCode
    177303
  • Title
    Fine-grain task aggregation and coordination on GPUs
  • Author
    Orr, Marc S. ; Beckmann, Bradford M. ; Reinhardt, Steven K. ; Wood, David A.
  • Author_Institution
    Comput. Sci., Univ. of Wisconsin-Madison, Madison, WI, USA
  • fYear
    2014
  • fDate
    14-18 June 2014
  • Firstpage
    181
  • Lastpage
    192
  • Abstract
    In general-purpose graphics processing unit (GPGPU) computing, data is processed by concurrent threads executing the same function. This model, dubbed single-instruction/multiple-thread (SIMT), requires programmers to coordinate the synchronous execution of similar operations across thousands of data elements. To alleviate this programmer burden, Gaster and Howes outlined the channel abstraction, which facilitates dynamically aggregating asynchronously produced fine-grain work into coarser-grain tasks. However, no practical implementation has been proposed. To this end, we propose and evaluate the first channel implementation. To demonstrate the utility of channels, we present a case study that maps the fine-grain, recursive task spawning in the Cilk programming language to channels by representing it as a flow graph. To support data-parallel recursion in bounded memory, we propose a hardware mechanism that allows wavefronts to yield their execution resources. Through channels and wavefront yield, we implement four Cilk benchmarks. We show that Cilk can scale with the GPU architecture, achieving speedups of as much as 4.3x on eight compute units.
  • Keywords
    flow graphs; graphics processing units; parallel architectures; parallel programming; Cilk programming language; GPGPU; GPU architecture; SIMT; asynchronously produced fine-grain work; bounded memory; channel abstraction; coarser-grain tasks; concurrent threads; data elements; data-parallel recursion; fine-grain task aggregation; flow graph; general-purpose graphics processing unit; programmer burden; recursive task spawning; single-instruction-multiple-thread; Arrays; Flow graphs; Graphics processing units; Hardware; Kernel; Parallel processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture (ISCA), 2014 ACM/IEEE 41st International Symposium on
  • Conference_Location
    Minneapolis, MN
  • Print_ISBN
    978-1-4799-4396-8
  • Type
    conf
  • DOI
    10.1109/ISCA.2014.6853209
  • Filename
    6853209