DocumentCode :
3177921
Title :
Fault recovery mechanism for multiprocessor servers
Author :
Masubuchi, Y. ; Hoshina, S. ; Shimada, T. ; Hirayama, B. ; Kato, N.
Author_Institution :
Inf. & Commun. Syst. Lab., Toshiba Corp., Tokyo, Japan
fYear :
1997
fDate :
24-27 June 1997
Firstpage :
184
Lastpage :
193
Abstract :
Achieving higher reliability in open server computer systems at low cost has attracted increasing interest recently. To meet this demand, we propose a new fault recovery mechanism. We extended the recovery cache scheme to suit state-of-the-art multiprocessor server computer systems and built a system-level fault recovery mechanism. It enables the system to recover from most intermittent hardware errors without rebooting. Furthermore, faulty processors can be isolated dynamically, and the mechanism can recover not only from hardware errors but also from many operating system panics caused by unanticipated software errors. The fault recovery mechanism is implemented with an "add-on" hardware module and a controlling software module, and is fully transparent to application programs. Thus no modification to the basic hardware is required, and binary compatibility, which is mandatory for open systems, is maintained. System performance was evaluated using the TPC-C benchmark. We also built an experimental system with prototype hardware.
Keywords :
computer network reliability; multiprocessing systems; open systems; system recovery; TPC-C benchmark; binary compatibility; fault recovery mechanism; intermittent hardware errors; multiprocessor servers; open server computer systems; recovery cache scheme; reliability; system level fault recovery mechanism; unanticipated software errors; Application software; Computer errors; Computer network reliability; Computer networks; Costs; Fault tolerance; Hardware; Maintenance; Network servers; Operating systems;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fault-Tolerant Computing, 1997. FTCS-27. Digest of Papers., Twenty-Seventh Annual International Symposium on
Conference_Location :
Seattle, WA, USA
ISSN :
0731-3071
Print_ISBN :
0-8186-7831-3
Type :
conf
DOI :
10.1109/FTCS.1997.614091
Filename :
614091