DocumentCode :
3278485
Title :
A fault tolerant K-means algorithm based on storage-class memory
Author :
Guoliang Zhu ; Kai Lu ; Xu Li
Author_Institution :
Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China
fYear :
2013
fDate :
23-25 May 2013
Firstpage :
985
Lastpage :
988
Abstract :
Checkpointing is a pervasive method for providing fault tolerance in high-performance computing systems. However, as system scale grows rapidly, checkpoints must be taken more frequently, which makes the overhead of checkpointing intolerable. In this paper we present a checkpoint-free fault tolerance method. It takes advantage of the non-volatile property of storage-class memory (SCM) to store execution-relevant data. Our method introduces negligible overhead to the algorithm when failures strike and substantially reduces the recovery overhead. We add fault-tolerant capacity to the K-means clustering algorithm using our method. Experimental results indicate that our approach introduces much less overhead than checkpointing does.
Keywords :
checkpointing; parallel machines; pattern clustering; random-access storage; software fault tolerance; ubiquitous computing; K-means clustering algorithm; SCM; checkpoint-free fault tolerance method; checkpointing; execution-relevant data storage; fault tolerant K-means algorithm; fault-tolerant capacity; high performance computing systems; nonvolatile property; pervasive method; recovery overhead reduction; storage-class memory; supercomputers; Educational institutions; Fault tolerance; Fault tolerant systems; Libraries; Checkpointing; Fault Tolerance; High-Performance Computing; K-means; Storage-class Memory;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Engineering and Service Science (ICSESS), 2013 4th IEEE International Conference on
Conference_Location :
Beijing
ISSN :
2327-0586
Print_ISBN :
978-1-4673-4997-0
Type :
conf
DOI :
10.1109/ICSESS.2013.6615471
Filename :
6615471