Title :
Effective SBML Protocol for Tolerating Concurrent Failures Based on Broadcast Networks
Author :
Jinho Ahn ; Yoohwan Kim
Author_Institution :
Dept. of Comput. Sci., Kyonggi Univ., Suwon, South Korea
Abstract :
The generally accepted view of sender-based message logging (SBML) is that it cannot tolerate concurrent failures without rolling back non-faulty processes. This inherent limitation stems from the fact that a copy of the receive sequence number (RSN) of each message is kept only in its sender's volatile memory. In this paper, we design an effective SBML protocol based on the following two observations. First, to preserve the no-rollback property of non-faulty processes even under simultaneous failures, it is essential to replicate the RSN of each message not only on its sender but also on other processes. However, this redundancy generally incurs high extra communication overhead. Second, all previous SBML protocols are oblivious to the underlying network. This obliviousness may leave no room for a fundamental breakthrough in achieving the high scalability required in a cluster system composed of a large number of nodes connected by a broadcast network. The proposed protocol lifts this critical limitation without abandoning the no-rollback property, while minimizing the additional communication cost of RSN replication by effectively exploiting the properties of broadcast networks. The analysis results show that our protocol has considerable potential to reduce network stress, measured as the number of control messages passing over the network.
Keywords :
protocols; workstation clusters; SBML protocol; behavioral feature; broadcast networks; cluster system; communication overhead; concurrent failures; control messages; nonfaulty processes; receive sequence number; rollback property; sender-based message logging; volatile memory; Broadcasting; Computer crashes; Protocols; Receivers; Redundancy; Scalability; broadcast network; cluster computing system; distributed system; fault-tolerance; message logging;
Conference_Title :
Dependable, Autonomic and Secure Computing (DASC), 2014 IEEE 12th International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4799-5078-2
DOI :
10.1109/DASC.2014.13