DocumentCode :
2540388
Title :
Fault-tolerant grid services using primary-backup: feasibility and performance
Author :
Zhang, Xianan ; Zagorodnov, Dmitrii ; Hiltunen, Matti ; Marzullo, Keith ; Schlichting, Richard D.
Author_Institution :
California Univ., San Diego, CA, USA
fYear :
2004
fDate :
20-23 Sept. 2004
Firstpage :
105
Lastpage :
114
Abstract :
The combination of grid technology and Web services has produced an attractive platform for deploying distributed applications: grid services, as represented by the Open Grid Services Infrastructure (OGSI) and its Globus Toolkit implementation. As grid services grow in popularity, tolerating failures becomes increasingly important. This work addresses the problem of building a reliable and highly available grid service by replicating the service on two or more hosts using the primary-backup approach. The primary goal is to evaluate the ease and efficiency with which this can be done, by first designing a primary-backup protocol using OGSI and then implementing it using Globus to evaluate performance implications and tradeoffs. We compared three implementations: one that makes heavy use of the notification interface defined in OGSI, one that uses standard grid service requests instead of notification, and one that uses low-level socket primitives. The overall conclusion is that, while the performance penalty of using Globus primitives (especially notification) for replica coordination can be significant, the OGSI model is suitable for building highly available services and makes the task of engineering such services easier.
Keywords :
Internet; fault tolerant computing; grid computing; open systems; performance evaluation; protocols; Globus toolkit; OGSI; Open Grid Services Infrastructure; Web services; distributed applications; fault-tolerant grid services; grid technology; low-level socket primitives; notification interface; primary-backup approach; primary-backup protocol; replica coordination; Buildings; Computer crashes; Databases; Distributed computing; Fault tolerance; Grid computing; Hardware; Laboratories; Sockets; Web services;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Cluster Computing, 2004 IEEE International Conference on
ISSN :
1552-5244
Print_ISBN :
0-7803-8694-9
Type :
conf
DOI :
10.1109/CLUSTR.2004.1392608
Filename :
1392608