DocumentCode :
3062350
Title :
Benefits of Software Rejuvenation on HPC Systems
Author :
Naksinehaboon, Nichamon ; Taerat, Narate ; Leangsuksun, Chokchai ; Chandler, Clayton F. ; Scott, Stephen L.
Author_Institution :
Coll. of Eng. & Sci., Louisiana Tech Univ., Ruston, LA, USA
fYear :
2010
fDate :
6-9 Sept. 2010
Firstpage :
499
Lastpage :
506
Abstract :
Rejuvenation is a technique expected to mitigate failures in HPC systems by replacing, repairing, or resetting system components. Because of the small overhead required by software rejuvenation, we primarily focus on OS/kernel rejuvenation. In this paper, we propose three rejuvenation scheduling techniques. Moreover, we investigate the claim that software rejuvenation prolongs failure times in HPC systems. Also, we compare the lost computing times of the checkpoint/restart mechanism with and without rejuvenation after each checkpoint.
Keywords :
checkpointing; object-oriented programming; operating system kernels; HPC system; OS-kernel rejuvenation; checkpoint-restart mechanism; failure mitigation; rejuvenation scheduling; software rejuvenation; system component repair; system component replacement; system component resetting; Availability; Hardware; Kernel; Numerical models; Software reliability;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing with Applications (ISPA), 2010 International Symposium on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-8095-1
Electronic_ISBN :
978-0-7695-4190-7
Type :
conf
DOI :
10.1109/ISPA.2010.82
Filename :
5634374