  • DocumentCode
    1987550
  • Title
    Kokkos: Enabling Performance Portability Across Manycore Architectures
  • Author
    Edwards, H. Carter; Trott, Christian R.
  • Author_Institution
    Sandia Nat. Labs., Albuquerque, NM, USA
  • fYear
    2013
  • fDate
    15-16 Aug. 2013
  • Firstpage
    18
  • Lastpage
    24
  • Abstract
    The manycore revolution in computational hardware can be characterized by increasing thread counts, decreasing memory per thread, and architecture specific performance constraints for memory access patterns. High performance computing (HPC) on emerging manycore architectures requires codes to exploit every opportunity for thread-level parallelism and satisfy conflicting performance constraints. We developed the Kokkos C++ library to provide scientific and engineering codes with a user accessible manycore performance portable programming model. The two foundational abstractions of Kokkos are (1) dispatch work to a manycore device for parallel execution and (2) manage multidimensional arrays with polymorphic layouts. The integration of these abstractions enables users' code to satisfy multiple architecture specific memory access pattern performance constraints without having to modify their source code. In this paper we describe the Kokkos abstractions, summarize its application programmer interface (API), and present performance results for a molecular dynamics computational kernel and finite element mini-application.
  • Keywords
    C++ language; application program interfaces; multi-threading; multiprocessing systems; parallel architectures; software libraries; software portability; source code (software); API; HPC; Kokkos C++ library; Kokkos abstractions; application programmer interface; computational hardware; finite element miniapplication; high-performance computing; manycore architectures; manycore device; memory access patterns; molecular dynamics computational kernel; multidimensional array management; multiple architecture specific memory access pattern performance constraints; parallel execution; polymorphic layouts; source code; thread counts; thread-level parallelism; user accessible manycore performance portable programming model; Arrays; Indexes; Kernel; Layout; Libraries; Performance evaluation; Programming;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Extreme Scaling Workshop (XSW), 2013
  • Conference_Location
    Boulder, CO
  • Type
    conf
  • DOI
    10.1109/XSW.2013.7
  • Filename
    6805038