• DocumentCode
    2483587
  • Title
    Parallel data-locality aware stencil computations on modern micro-architectures
  • Author
    Christen, Matthias ; Schenk, Olaf ; Neufeld, Esra ; Messmer, Peter ; Burkhart, Helmar
  • Author_Institution
    Comput. Sci. Dept., Univ. of Basel, Basel, Switzerland
  • fYear
    2009
  • fDate
    23-29 May 2009
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    Novel micro-architectures, including the Cell Broadband Engine Architecture and graphics processing units, are attractive platforms for compute-intensive simulations. This paper focuses on stencil computations arising in a biomedical simulation, presents performance benchmarks on both the Cell BE and GPUs, and contrasts them with a benchmark on a traditional CPU system. Because stencil computations have low arithmetic intensity, they typically reach only a fraction of the peak performance of the compute hardware. An algorithm is presented that reduces the bandwidth requirements, and thereby improves performance, by exploiting temporal locality of the data. We report on performance improvements over CPU implementations.
  • Keywords
    coprocessors; digital simulation; medical computing; multiprocessing systems; parallel processing; performance evaluation; Cell Broadband Engine Architecture; biomedical simulation; graphics processing units; modern microarchitectures; parallel data-locality; performance benchmarks; stencil computations; Arithmetic; Biomedical computing; Central Processing Unit; Computational modeling; Computer architecture; Concurrent computing; Context modeling; Engines; Graphics; Hardware
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
  • Conference_Location
    Rome
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4244-3751-1
  • Electronic_ISBN
    1530-2075
  • Type
    conf
  • DOI
    10.1109/IPDPS.2009.5161031
  • Filename
    5161031