DocumentCode :
2748826
Title :
Algorithm-Based Fault Tolerance Applied to P2P Computing Networks
Author :
Roche, Thomas ; Cunche, Mathieu ; Roch, Jean-Louis
Author_Institution :
CS (Commun. & Syst.), Le Plessis-Robinson, France
fYear :
2009
fDate :
11-16 Oct. 2009
Firstpage :
144
Lastpage :
149
Abstract :
P2P computing platforms are subject to a wide range of attacks. In this paper, we propose a generalisation of the previous disk-less checkpointing approach for fault-tolerance in high performance computing systems. Our contribution is twofold. First, instead of restricting ourselves to 2D checksums, which tolerate only a small number of node failures, we propose basing disk-less checkpointing on linear codes, which can potentially tolerate a large number of faults. Second, we compare and analyse low-density parity-check (LDPC) codes against classical Reed-Solomon (RS) codes with respect to different fault models suited to P2P systems. Our LDPC disk-less checkpointing method is well suited when only node disconnections are considered, but cannot deal with Byzantine peers. Our RS disk-less checkpointing method tolerates such Byzantine errors, but is restricted to exact finite field computations.
Keywords :
Reed-Solomon codes; checkpointing; fault tolerant computing; linear codes; parity check codes; peer-to-peer computing; LDPC; P2P computing network; Reed-Solomon code; algorithm-based fault tolerance; disk-less checkpointing; high performance computing system; linear code; low density parity check; Checkpointing; Computer networks; Fault tolerance; Fault tolerant systems; Galois fields; High performance computing; Linear code; Parity check codes; Peer to peer computing; Reed-Solomon codes; ABFT; P2P; SUMMA; distributed computing; fault-tolerance; linear coding;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advances in P2P Systems, 2009. AP2PS '09. First International Conference on
Conference_Location :
Sliema
Print_ISBN :
978-1-4244-5084-8
Electronic_ISBN :
978-0-7695-3831-0
Type :
conf
DOI :
10.1109/AP2PS.2009.30
Filename :
5359030