DocumentCode :
2719677
Title :
Disaster tolerant Wolfpack geo-clusters
Author :
Wilkins, Richard S. ; Du, Xing ; Cochran, Robert A. ; Popp, Matthias
Author_Institution :
Hewlett-Packard Co., Bellevue, WA, USA
fYear :
2002
fDate :
2002
Firstpage :
222
Lastpage :
227
Abstract :
Clustering of computer systems to increase application availability has become a common industry practice. While it does increase the availability of applications and their data to users, it does not solve the problem of a disaster (flood, tornado, earthquake, terrorism, civil unrest, etc.) making the entire cluster, and the applications and data it is serving, unavailable. Distance mirroring of an application's data store allows for recovery from disaster but may still result in long periods of unacceptable downtime. This paper describes a method for stretching a standard Wolfpack (Microsoft Cluster Service, MSCS) cluster of Intel architecture servers geographically for disaster tolerance. Server nodes and their storage may be placed at two (or more) distant sites to prevent a single disaster from taking down the entire cluster. Standard cluster semantics and ease of use are maintained using the remote mirroring capabilities of Hewlett-Packard's high-end storage arrays. The design of additional software to control data mirroring behavior when moving or failing over applications between server nodes is described. Also described are software that allows "stretching" the cluster quorum disk between sites in a manner that is transparent to the cluster software, and software for an external arbitrator node that provides rapid recovery from total loss of inter-site communications. Flexibility provided by the array's firmware mirroring options (i.e., synchronous or asynchronous I/O mirroring) allows for optimum use of inter-site link bandwidth based on the data safety requirements of individual applications.
Keywords :
distributed processing; fault tolerant computing; system recovery; workstation clusters; Wolfpack; cluster quorum disk; cluster semantics; clustering; data mirroring; disaster recovery; disaster tolerance; remote mirroring; Application software; Bandwidth; Computer industry; Earthquakes; Floods; Microprogramming; Safety; Software design; Terrorism; Tornadoes;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster Computing, 2002. Proceedings. 2002 IEEE International Conference on
Print_ISBN :
0-7695-2066-9
Type :
conf
DOI :
10.1109/CLUSTR.2002.1137750
Filename :
1137750