• DocumentCode
    2378004
  • Title
    A generic availability model for clustered computing systems
  • Author
    Sun, Hairong ; Han, Jame J. ; Levendel, Haim
  • Author_Institution
    High Availability & Reliability Technol. Center, Motorola, Elk Grove Village, IL, USA
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    241
  • Lastpage
    248
  • Abstract
    We study the availability of a clustered computing system with one cluster manager and "N+M" processing nodes, where M processing nodes serve as spares for the N active processing nodes. The functionality of an individual processing node is dissected into application software, management software, OS and hardware, and the dependencies among these entities are considered. Stochastic Petri net models are constructed to investigate the cluster availability. To deal with very large clusters, a solution based on state aggregation and fixed-point iteration is proposed. The existence and uniqueness of the fixed point are proved. The impacts of the cluster manager, switchover time and coverage ratio are quantitatively studied. From the numerical results for a simple cluster with "2+1" processing nodes, we find that: (1) the availability of the cluster manager does not have a significant impact on the system availability, and (2) system availability increases with the coverage ratio and decreases with the switchover time. Mechanisms to improve the system availability are discussed.
  • Keywords
    Petri nets; fault tolerant computing; stochastic processes; workstation clusters; active processing nodes; application software; cluster availability; cluster manager; clustered computing systems; coverage ratio; fixed point iteration; generic availability model; management software; processing nodes; state aggregation; stochastic Petri net models; switchover time; system availability; uniqueness; Application software; Availability; Computer architecture; Concurrent computing; Distributed computing; Hardware; Operating systems; Stochastic processes; Sun; Technology management;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Dependable Computing, 2001. Proceedings. 2001 Pacific Rim International Symposium on
  • Conference_Location
    Seoul
  • Print_ISBN
    0-7695-1414-6
  • Type
    conf
  • DOI
    10.1109/PRDC.2001.992704
  • Filename
    992704