DocumentCode :
2943664
Title :
Application Cluster Service Scheme for Near-Zero-Downtime Services
Author :
Cheng, Fan-tien ; Wu, Shang-Lun ; Tsai, Ping-Yen ; Chung, Yun-Ta ; Yang, Haw-Ching
Author_Institution :
Institute of Manufacturing Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C.; e-mail: chengft@mail.ncku.edu.tw
fYear :
2005
fDate :
18-22 April 2005
Firstpage :
4062
Lastpage :
4067
Abstract :
Applications of a distributed computer system are required to provide continuous service 24 hours a day, 7 days a week. However, computer failures caused by exhaustion of operating system resources, data corruption, numerical error accumulation, and similar faults may interrupt services and cause significant losses. Hence, this work proposes an application cluster service (APCS) scheme. The proposed APCS provides both a failover scheme and a state recovery scheme for failure management. The failover scheme is designed mainly to automatically activate the backup application to replace an application whenever it is sick or down. Meanwhile, the state recovery scheme is intended primarily to provide an inheritable design pattern that supports applications with state recovery requirements. An application simply needs to inherit and implement this design pattern to accomplish state backup and recovery. Furthermore, a performance evaluator (PEV) that can detect performance degradation and predict time to failure is developed in this study. Using these detection and prediction capabilities, the APCS can perform the failover process before a node breaks down. Thus, applying APCS and PEV enables a distributed computer system to provide near-zero-downtime services.
Keywords :
Application cluster service (APCS); failover scheme; near-zero-downtime service; performance evaluator (PEV); state recovery scheme; Application software; Availability; Computer errors; Control engineering; Degradation; Distributed computing; Manufacturing; Middleware; Operating systems; Reliability engineering
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Proceedings of the 2005 IEEE International Conference on Robotics and Automation (ICRA 2005)
Print_ISBN :
0-7803-8914-X
Type :
conf
DOI :
10.1109/ROBOT.2005.1570743
Filename :
1570743