DocumentCode
1987550
Title
Kokkos: Enabling Performance Portability Across Manycore Architectures
Author
Edwards, H. Carter ; Trott, Christian R.
Author_Institution
Sandia Nat. Labs., Albuquerque, NM, USA
fYear
2013
fDate
15-16 Aug. 2013
Firstpage
18
Lastpage
24
Abstract
The manycore revolution in computational hardware can be characterized by increasing thread counts, decreasing memory per thread, and architecture-specific performance constraints for memory access patterns. High-performance computing (HPC) on emerging manycore architectures requires codes to exploit every opportunity for thread-level parallelism and satisfy conflicting performance constraints. We developed the Kokkos C++ library to provide scientific and engineering codes with a user-accessible, manycore, performance-portable programming model. The two foundational abstractions of Kokkos are (1) dispatch work to a manycore device for parallel execution and (2) manage multidimensional arrays with polymorphic layouts. The integration of these abstractions enables users' code to satisfy multiple architecture-specific memory access pattern performance constraints without having to modify their source code. In this paper we describe the Kokkos abstractions, summarize the library's application programmer interface (API), and present performance results for a molecular dynamics computational kernel and a finite element mini-application.
Keywords
C++ language; application program interfaces; multi-threading; multiprocessing systems; parallel architectures; software libraries; software portability; source code (software); API; HPC; Kokkos C++ library; Kokkos abstractions; application programmer interface; computational hardware; finite element miniapplication; high-performance computing; manycore architectures; manycore device; memory access patterns; molecular dynamics computational kernel; multidimensional array management; multiple architecture specific memory access pattern performance constraints; parallel execution; polymorphic layouts; source code; thread counts; thread-level parallelism; user accessible manycore performance portable programming model; Arrays; Indexes; Kernel; Layout; Libraries; Performance evaluation; Programming
fLanguage
English
Publisher
IEEE
Conference_Titel
Extreme Scaling Workshop (XSW), 2013
Conference_Location
Boulder, CO
Type
conf
DOI
10.1109/XSW.2013.7
Filename
6805038