• DocumentCode
    169131
  • Title
    Cross-Layer Self-Adaptive/Self-Aware System Software for Exascale Systems
  • Author
    Gioiosa, R. ; Kestor, G. ; Kerbyson, D.J. ; Hoisie, A.
  • Author_Institution
    High Performance Comput., Pacific Northwest Nat. Lab., Richland, WA, USA
  • fYear
    2014
  • fDate
    22-24 Oct. 2014
  • Firstpage
    326
  • Lastpage
    333
  • Abstract
    The extreme level of parallelism coupled with the limited available power budget expected in the exascale era brings unprecedented challenges that demand optimization of performance, power and resiliency in unison. Scalability on such systems is of paramount importance, while power and reliability issues may change the execution environment in which a parallel application runs. To solve these challenges, exascale systems will require introspective system software that combines system and application observations across all system stack layers with online feedback and adaptation mechanisms. In this paper we propose the design of a novel self-aware, self-adaptive system software in which a kernel-level Monitor, which continuously inspects the evolution of the target system through observation of Sensors, is combined with a user-level Controller, which reacts to changes in the execution environment, explores opportunities to increase performance and save power, and adapts applications to new execution scenarios. We show that the monitoring system accurately monitors the evolution of parallel applications with a runtime overhead below 1-2%. As a test case, we design and implement a runtime system that aims at optimizing application performance and system power consumption on complex hierarchical architectures. Our results show that our adaptive system reaches 98% of the performance efficiency of manually-tuned applications.
  • Keywords
    parallel processing; power aware computing; self-adjusting systems; sensors; software fault tolerance; software performance evaluation; adaptation mechanisms; complex hierarchical architectures; cross-layer self-adaptive system software; cross-layer self-aware system software; exascale systems; execution environment; introspective system software; kernel-level monitoring; manually-tuned applications; online feedback; parallel application; parallel applications; performance demand optimization; reliability issues; system power consumption; user-level controller; Hardware; Monitoring; Power demand; Runtime; Temperature measurement; Temperature sensors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture and High Performance Computing (SBAC-PAD), 2014 IEEE 26th International Symposium on
  • Conference_Location
    Jussieu
  • ISSN
    1550-6533
  • Type
    conf
  • DOI
    10.1109/SBAC-PAD.2014.29
  • Filename
    6970681