• DocumentCode
    3585395
  • Title
    An OpenACC Extension for Data Layout Transformation
  • Author
    Hoshino, Tetsuya; Maruyama, Naoya; Matsuoka, Satoshi
  • Author_Institution
    Tokyo Inst. of Technol., Tokyo, Japan
  • fYear
    2014
  • Firstpage
    12
  • Lastpage
    18
  • Abstract
    OpenACC is gaining momentum as an implicit and portable interface for porting legacy CPU-based applications to heterogeneous, highly parallel computational environments involving many-core accelerators such as GPUs and Intel Xeon Phi. OpenACC provides a set of loop directives, similar to OpenMP, for parallelization and for managing data movement, attaining functional portability across different heterogeneous devices; however, the performance portability of OpenACC is said to be insufficient due to the characteristics of different target devices, especially those regarding memory layouts, as automated attempts by compilers to adapt are currently difficult. We are currently working to propose a set of directives that give compilers better semantic information for adaptation; here, we particularly focus on data layouts such as Structure of Arrays, an advantageous data structure for GPUs, as opposed to Array of Structures, which exhibits good performance on CPUs. We propose a directive extension to OpenACC that allows users to flexibly specify optimal layouts, even if the data structures are nested. Performance results show that we gain as much as 96% in performance for CPUs and 165% for GPUs compared to programs without such directives, essentially attaining both functional and performance portability in OpenACC.
  • Keywords
    data handling; graphics processing units; multiprocessing systems; GPU; Intel Xeon Phi; OpenACC extension; OpenMP; data layout transformation; data movement management; data structure; graphics processing unit; many-core accelerator; parallel computational environment; porting legacy CPU-based applications; semantic information; Arrays; Benchmark testing; Graphics processing units; Kernel; Layout; Performance evaluation; Programming
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Accelerator Programming using Directives (WACCPD), 2014 First Workshop on
  • Type
    conf
  • DOI
    10.1109/WACCPD.2014.12
  • Filename
    7081673