• DocumentCode
    3341179
  • Title

    A resilient application-level failure detection system for distributed computing environments

  • Author

    Welch, Bob ; Helal, Abdelsalam ; Elmasri, Ramez

  • Author_Institution
    Dept. of Comput. Sci. Eng., Texas Univ., Arlington, TX, USA
  • fYear
    1995
  • fDate
    27-29 July 1995
  • Firstpage
    401
  • Lastpage
    406
  • Abstract
    A methodology for detecting failures that occur in distributed computer systems connected by a communications network is described. The methodology utilizes active polling of monitored systems. The entities polled must be service entities that function at the application layers of service providing machines. A prototype system has been implemented to test this methodology.
  • Keywords
    client-server systems; computer network management; computer network reliability; monitoring; open systems; protocols; system recovery; active polling; application layers; communications network; distributed computing environments; monitored systems; prototype system; resilient application-level failure detection system; service entities; service providing machines; Aging; Application software; Computer networks; Computer science; Computerized monitoring; Condition monitoring; Distributed computing; Network servers; Protocols; TCP/IP
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computers and Communications, 1995. Proceedings., IEEE Symposium on
  • Conference_Location
    Alexandria, Egypt
  • Print_ISBN
    0-8186-7075-4
  • Type

    conf

  • DOI
    10.1109/SCAC.1995.523694
  • Filename
    523694
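
The abstract describes active polling of application-layer service entities but this record gives no protocol details. The following is only a minimal sketch of that general idea, not the authors' prototype. The `PING` request, the `max_misses` threshold, and the names `tcp_probe`, `ServiceState`, and `Poller` are all assumptions made for illustration.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Endpoint = Tuple[str, int]  # (host, port) of a monitored service


def tcp_probe(endpoint: Endpoint, request: bytes = b"PING\n",
              timeout: float = 2.0) -> bool:
    """Send an application-level request and report whether any reply arrives.

    The request payload is hypothetical; a real service would need a probe
    that speaks its own protocol.
    """
    try:
        with socket.create_connection(endpoint, timeout=timeout) as conn:
            conn.sendall(request)
            return len(conn.recv(1024)) > 0
    except OSError:  # refused, unreachable, or timed out
        return False


@dataclass
class ServiceState:
    misses: int = 0       # consecutive unanswered polls
    failed: bool = False  # True once the service has been declared failed


class Poller:
    """Polls each service and declares failure after max_misses missed polls."""

    def __init__(self, endpoints: Iterable[Endpoint],
                 probe: Callable[[Endpoint], bool] = tcp_probe,
                 max_misses: int = 3) -> None:
        self.probe = probe
        self.max_misses = max_misses  # assumed threshold, not from the paper
        self.states: Dict[Endpoint, ServiceState] = {
            ep: ServiceState() for ep in endpoints
        }

    def poll_once(self) -> List[Endpoint]:
        """Poll every service once; return endpoints newly declared failed."""
        newly_failed: List[Endpoint] = []
        for ep, state in self.states.items():
            if self.probe(ep):
                # Any successful reply clears the failure record.
                state.misses = 0
                state.failed = False
            else:
                state.misses += 1
                if state.misses >= self.max_misses and not state.failed:
                    state.failed = True
                    newly_failed.append(ep)
        return newly_failed
```

Probing at the application layer, rather than checking only that the host is reachable, is what lets such a poller notice a service that has hung while its machine is still up. Requiring several consecutive misses before declaring failure is one common way to tolerate a single lost reply; the record does not say whether the paper does this.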