  • DocumentCode
    3134816
  • Title
    Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
  • Author
    Datta, Kaushik ; Murphy, Mark ; Volkov, Vasily ; Williams, Samuel ; Carter, Jonathan ; Oliker, Leonid ; Patterson, David ; Shalf, John ; Yelick, Katherine
  • Author_Institution
    CRD/NERSC, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  • fYear
    2008
  • fDate
    15-21 Nov. 2008
  • Firstpage
    1
  • Lastpage
    12
  • Abstract
    Understanding the most efficient design and utilization of emerging multicore systems is one of the most challenging questions faced by the mainstream and scientific computing industries in several decades. Our work explores multicore stencil (nearest-neighbor) computations, a class of algorithms at the heart of many structured grid codes, including PDE solvers. We develop a number of effective optimization strategies and build an auto-tuning environment that searches over our optimizations and their parameters to minimize runtime while maximizing performance portability. To evaluate the effectiveness of these strategies, we explore the broadest set of multicore architectures in the current HPC literature, including the Intel Clovertown, AMD Barcelona, Sun Victoria Falls, IBM QS22 PowerXCell 8i, and NVIDIA GTX280. Overall, our auto-tuning optimization methodology results in the fastest multicore stencil performance to date. Finally, we present several key insights into the architectural tradeoffs of emerging multicore designs and their implications for scientific algorithm development.
  • Keywords
    computer architecture; grid computing; microprocessor chips; optimisation; partial differential equations; AMD Barcelona; IBM QS22 PowerXCell 8i; Intel Clovertown; NVIDIA GTX280; PDE solvers; Sun Victoria Falls; auto-tuning environment; auto-tuning optimization methodology; partial differential equation; state-of-the-art multicore architectures; stencil computation optimization; structured grid codes; Algorithm design and analysis; Computer architecture; Computer industry; Grid computing; Heart; Multicore processing; Optimization methods; Runtime environment; Scientific computing; Sun
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    SC 2008: International Conference for High Performance Computing, Networking, Storage and Analysis
  • Conference_Location
    Austin, TX
  • Print_ISBN
    978-1-4244-2834-2
  • Electronic_ISBN
    978-1-4244-2835-9
  • Type
    conf
  • DOI
    10.1109/SC.2008.5222004
  • Filename
    5222004