DocumentCode
3341179
Title
A resilient application-level failure detection system for distributed computing environments
Author
Welch, Bob ; Helal, Abdelsalam ; Elmasri, Ramez
Author_Institution
Dept. of Comput. Sci. Eng., Texas Univ., Arlington, TX, USA
fYear
1995
fDate
27-29 July 1995
Firstpage
401
Lastpage
406
Abstract
A methodology is described for detecting failures in distributed computer systems connected by a communications network. The methodology relies on active polling of monitored systems. The polled entities must be service entities that function at the application layers of service-providing machines. A prototype system has been implemented to test this methodology.
Keywords
client-server systems; computer network management; computer network reliability; monitoring; open systems; protocols; system recovery; active polling; application layers; communications network; distributed computing environments; monitored systems; prototype system; resilient application-level failure detection system; service entities; service providing machines; Aging; Application software; Computer networks; Computer science; Computerized monitoring; Condition monitoring; Distributed computing; Network servers; Protocols; TCP/IP
fLanguage
English
Publisher
IEEE
Conference_Titel
Computers and Communications, 1995. Proceedings., IEEE Symposium on
Conference_Location
Alexandria, Egypt
Print_ISBN
0-8186-7075-4
Type
conf
DOI
10.1109/SCAC.1995.523694
Filename
523694