  • DocumentCode
    598617
  • Title
    Dataflow-driven GPU performance projection for multi-kernel transformations

  • Author
    Meng, Jiayuan ; Morozov, V.A. ; Vishwanath, Venkatram ; Kumaran, Kalyan

  • fYear
    2012
  • fDate
    10-16 Nov. 2012
  • Firstpage
    1
  • Lastpage
    11
  • Abstract
    Applications often have a sequence of parallel operations to be offloaded to graphics processors; each operation can become an individual GPU kernel. Developers typically explore a variety of transformations for each kernel. Furthermore, it is well known that efficient data management is critical in achieving high GPU performance and that "fusing" multiple kernels into one may greatly improve data locality. Doing so, however, requires transformations across multiple, potentially nested, parallel loops; at the same time, the original code semantics and data dependency must be preserved. Since each kernel may have distinct data access patterns, their combined dataflow can be nontrivial. As a result, the complexity of multi-kernel transformations often leads to significant effort with no guarantee of performance benefits. This paper proposes a dataflow-driven analytical framework to project GPU performance for a sequence of parallel operations. Users need only provide CPU code skeletons for a sequence of parallel loops. The framework can then automatically identify opportunities for multi-kernel transformations and data management. It is also able to project the overall performance without implementing GPU code or using physical hardware.
  • Keywords
    graphics processing units; CPU code skeletons; code semantics; data access patterns; data dependency; data locality improvement; data management; dataflow-driven GPU performance projection; graphics processors; multikernel transformations; parallel loops; parallel operations; physical hardware; Arrays; Fuses; Graphics processing units; Instruction sets; Kernel; Optimization; Skeleton
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    2167-4329
  • Print_ISBN
    978-1-4673-0805-2
  • Type
    conf
  • DOI
    10.1109/SC.2012.42
  • Filename
    6468531