Title :
Concurrent rollback for crash recovery in extended hypercube networks
Author :
Juang, Tong-Yig ; Chiu, C.P. ; Yu, Kun-Ming
Author_Institution :
Dept. of Comput. Sci., Chung-Hua Polytech. Inst., Hsin Chu, Taiwan
Abstract :
Recovering from processor failures is an important problem in the design and development of reliable systems. We present a concurrent rollback algorithm for recovering from crash failures in extended hypercube networks that has small message and time complexities. An extended hypercube network is a hierarchical, recursive structure with low diameter. By appending only O(1) additional information to each message, the recovery work uses fewer than O(N log N) message exchanges and O(log2 N) elapsed time, where N is the number of processors in the extended hypercube network. The algorithm can be used to recover from the failure of an arbitrary number of processors.
Keywords :
communication complexity; computational complexity; fault tolerant computing; hypercube networks; parallel algorithms; system recovery; concurrent rollback algorithm; crash failures; crash recovery; extended hypercube networks; hierarchical low diameter recursive structure; message exchanges; processor failures; reliable systems; small message complexity; small time complexity; Buildings; Checkpointing; Communication networks; Computer crashes; Computer network reliability; Computer networks; Computer science; Hypercubes; Intelligent networks; System recovery;
Conference_Title :
Parallel Algorithms/Architecture Synthesis, 1995. Proceedings., First Aizu International Symposium on
Conference_Location :
Fukushima
Print_ISBN :
0-8186-7038-X
DOI :
10.1109/AISPAS.1995.401336