• DocumentCode
    1815160
  • Title

    Taming the beast: some thoughts on exascale resiliency

  • Author

    Troger, Peter

  • Author_Institution
    Hasso Plattner Inst., Univ. of Potsdam, Potsdam, Germany
  • fYear
    2013
  • fDate
    1-5 July 2013
  • Firstpage
    556
  • Lastpage
    557
  • Abstract
    The design and operation of high performance computing (HPC) infrastructures is, and always has been, a huge technological challenge. Whenever the next generation of HPC systems was about to be designed, the community faced an ever-growing number of compute nodes and storage capacity, increasing heterogeneity of software, a new level of nonlinear computational load, questions of energy consumption and cooling, and many other non-functional issues. So far, everybody has managed to deal with these issues in an exceptional and creative way. This time, it is about to become really hard.
  • Keywords
    cooling; energy consumption; parallel programming; power aware computing; storage management; HPC infrastructures; compute nodes; exascale HPC systems; high performance computing infrastructures; next generation HPC system; nonlinear computational load; software heterogeneity; storage capacity; Correlation; Fault tolerance; Fault tolerant systems; Fault trees; High performance computing; Programming; Uncertainty
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Simulation (HPCS), 2013 International Conference on
  • Conference_Location
    Helsinki
  • Print_ISBN
    978-1-4799-0836-3
  • Type

    conf

  • DOI
    10.1109/HPCSim.2013.6641469
  • Filename
    6641469