DocumentCode :
244403
Title :
Coarse-Grained Energy Modeling of Rollback/Recovery Mechanisms
Author :
Ibtesham, Dewan ; DeBonis, David ; Arnold, Dorian ; Ferreira, Kurt B.
Author_Institution :
Dept. of Comput. Sci., Univ. of New Mexico, Albuquerque, NM, USA
fYear :
2014
fDate :
23-26 June 2014
Firstpage :
708
Lastpage :
713
Abstract :
As high-performance computing systems continue to grow in size and complexity, energy efficiency and reliability have emerged as first-order concerns. Researchers have shown that data movement is a significant contributing factor to power consumption on these systems. Additionally, rollback/recovery protocols such as checkpoint/restart can generate large volumes of data traffic, exacerbating the energy and power concerns. In this work, we show that a coarse-grained model can be used effectively to speculate about the energy footprints of rollback/recovery protocols. Using our validated model, we evaluate the energy footprint of checkpoint compression, a method that incurs higher computational demand to reduce data volumes and data traffic. Specifically, we show that while checkpoint compression leads to more frequent checkpoints (as per the optimal checkpoint frequency) and increases the per-checkpoint energy cost, compression still yields a decrease in total application energy consumption due to the overall runtime decrease.
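The trade-off described in the abstract can be illustrated with a rough numerical sketch. This is not the authors' model: it assumes Young's first-order approximation for the optimal checkpoint interval (tau = sqrt(2 * delta * MTBF)), a simple first-order runtime estimate, and entirely hypothetical power and timing figures chosen only to show how a shorter but more power-hungry checkpoint might still lower total energy.

```python
import math

def estimate(t_solve, mtbf, delta, e_ckpt, p_compute):
    """First-order runtime/energy estimate for periodic checkpointing.

    All arguments are hypothetical inputs (seconds, joules, watts).
    """
    # Young's approximation for the optimal interval (an assumption here;
    # the abstract does not say which optimal-frequency formula is used).
    tau = math.sqrt(2.0 * delta * mtbf)
    n_ckpt = t_solve / tau
    # Assumption: restart takes as long as one checkpoint commit.
    restart = delta
    # Expected time lost per failure: half a segment plus the restart.
    loss_per_failure = (tau + delta) / 2.0 + restart
    # Solve T = t_solve + n*delta + (T / mtbf) * loss_per_failure for T.
    t_total = (t_solve + n_ckpt * delta) / (1.0 - loss_per_failure / mtbf)
    # Assumption: everything except checkpointing runs at compute power.
    energy = p_compute * (t_total - n_ckpt * delta) + n_ckpt * e_ckpt
    return {"tau": tau, "t_total": t_total, "energy": energy}

# Hypothetical per-node figures, for illustration only.
T_SOLVE = 360_000.0   # failure-free solve time, s
MTBF = 21_600.0       # mean time between failures, s
P_COMPUTE = 200.0     # power while computing or compressing, W
P_IO = 100.0          # power while waiting on checkpoint I/O, W

# Uncompressed: 100 s of I/O per checkpoint.
DELTA_PLAIN = 100.0
E_CKPT_PLAIN = DELTA_PLAIN * P_IO

# Compressed: 45 s of compression plus 15 s of I/O per checkpoint.
DELTA_COMPRESSED = 45.0 + 15.0
E_CKPT_COMPRESSED = 45.0 * P_COMPUTE + 15.0 * P_IO

plain = estimate(T_SOLVE, MTBF, DELTA_PLAIN, E_CKPT_PLAIN, P_COMPUTE)
compressed = estimate(T_SOLVE, MTBF, DELTA_COMPRESSED, E_CKPT_COMPRESSED, P_COMPUTE)

for name, r in (("plain", plain), ("compressed", compressed)):
    print(f"{name}: tau={r['tau']:.0f} s, runtime={r['t_total']:.0f} s, "
          f"energy={r['energy'] / 1e6:.2f} MJ")
```

Under these assumed numbers, the compressed configuration checkpoints more often and spends more energy per checkpoint, yet its estimated runtime and total energy are both lower, which mirrors the qualitative claim in the abstract. Different inputs can reverse the outcome, so the sketch should be read as an illustration of the mechanism rather than a reproduction of the paper's results.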
Keywords :
checkpointing; energy conservation; parallel processing; power consumption; protocols; software reliability; checkpoint compression; checkpoint energy cost; coarse-grained energy modeling; coarse-grained model; computational demand; data movement; data traffic; data volumes; energy consumption; energy efficiency; energy footprints; energy reliability; first-order concern; high-performance computing system; optimal checkpoint frequency; power consumption; rollback/recovery mechanisms; rollback/recovery protocol; runtime decrease; Energy consumption; Energy measurement; Optimization; Power measurement; Predictive models; Protocols; Time measurement; Checkpoint Compression; Checkpoint Restart; Fault Tolerance; Modeling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Dependable Systems and Networks (DSN), 2014 44th Annual IEEE/IFIP International Conference on
Conference_Location :
Atlanta, GA
Type :
conf
DOI :
10.1109/DSN.2014.71
Filename :
6903629