Title :
Energy considerations in checkpointing and fault tolerance protocols
Author :
el Mehdi Diouri, M. ; Glück, Olivier ; Lefevre, Laurent ; Cappello, Franck
Author_Institution :
Lab. de l'Inf. du Parallelisme, Univ. Lyon 1, Lyon, France
Abstract :
Exascale supercomputers will gather hundreds of millions of cores. The first problem that we address is resiliency: fault tolerance is needed for applications to run to completion on such platforms. The second problem is energy consumption, since such systems will consume an enormous amount of energy. In this paper, we evaluate checkpointing and existing fault tolerance protocols from an energy point of view. We measure on a real testbed the power consumption of the main atomic operations found in these protocols. The first results show that process coordination and RAM logging consume more power than checkpointing and HDD logging. However, our results for I/O operations, expressed in joules per byte, show that checkpointing and HDD logging consume more energy than RAM logging. Finally, we propose to consider energy consumption as a criterion for the choice of fault tolerance protocols. In terms of energy consumption, we should promote message logging for applications exchanging small volumes of data and coordination for applications involving few processes.
Keywords :
checkpointing; fault tolerance; input-output programs; mainframes; power consumption; protocols; random-access storage; system monitoring; HDD logging; I-O operations; RAM logging; data volumes; energy consumption; energy point of view; exascale supercomputers; fault tolerance protocols; main atomic operations; process coordination; fault tolerant systems; power demand; random access memory; evaluation
Conference_Title :
Dependable Systems and Networks Workshops (DSN-W), 2012 IEEE/IFIP 42nd International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4673-2264-5
Electronic_ISBN :
978-1-4673-2265-2
DOI :
10.1109/DSNW.2012.6264670