• DocumentCode
    26999
  • Title

    Scalable Analytics for IaaS Cloud Availability

  • Author

    Ghosh, Rahul ; Longo, Francesco ; Frattini, Flavio ; Russo, Stefano ; Trivedi, Kishor S.

  • Author_Institution
    IBM, Durham, NC, USA
  • Volume
    2
  • Issue
    1
  • fYear
    2014
  • fDate
    Jan.-March 2014
  • Firstpage
    57
  • Lastpage
    70
  • Abstract
    In a large Infrastructure-as-a-Service (IaaS) cloud, component failures are quite common. Such failures may lead to occasional system downtime and eventual violation of Service Level Agreements (SLAs) on the cloud service availability. The availability analysis of the underlying infrastructure is useful to the service provider to design a system capable of providing a defined SLA, as well as to evaluate the capabilities of an existing one. This paper presents a scalable, stochastic model-driven approach to quantify the availability of a large-scale IaaS cloud, where failures are typically dealt with through migration of physical machines among three pools: hot (running), warm (turned on, but not ready), and cold (turned off). Since monolithic models do not scale for large systems, we use an approach based on interacting Markov chains to demonstrate the reduction in the complexity of analysis and the solution time. The three pools are modeled by interacting sub-models. Dependencies among them are resolved using fixed-point iteration, for which the existence of a solution is proved. The analytic-numeric solutions obtained from the proposed approach and from the monolithic model are compared. We show that the errors introduced by interacting sub-models are insignificant and that our approach can handle very large IaaS clouds. A simulation-based solution is also considered for the proposed model, and the solution times of the methods are compared.
  • Keywords
    Markov processes; cloud computing; contracts; iterative methods; system monitoring; IaaS cloud availability; Markov chain based approach; SLA; analytic-numeric solutions; cloud service availability; component failures; fixed-point iteration; infrastructure-as-a-service cloud; large-scale IaaS cloud; monolithic models; physical machines; scalable analytics; service level agreements; service provider; stochastic model-driven approach; system downtime; Analytical models; Cloud computing; Computational modeling; Failure analysis; Maintenance engineering; Markov processes; Analytic-numeric solution; availability; cloud computing; downtime; simulation; stochastic reward nets;
  • fLanguage
    English
  • Journal_Title
    Cloud Computing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2168-7161
  • Type

    jour

  • DOI
    10.1109/TCC.2014.2310737
  • Filename
    6762970
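
The fixed-point iteration mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's model: the two sub-models, their formulas, and every parameter value below (`lam`, `mu_fast`, `mu_slow`, `n_hot`, `refill`) are hypothetical and chosen only to show how two interacting sub-models exchange outputs until they agree. Here a "hot" sub-model computes machine availability given the probability that a warm spare exists, and a "warm" sub-model computes that probability given the demand generated by the hot pool.

```python
# Toy illustration of resolving a cyclic dependency between two
# sub-models by fixed-point iteration. All formulas and parameter
# values are assumptions for illustration, not taken from the paper.

def hot_availability(p_spare, lam=0.001, mu_fast=1.0, mu_slow=0.1):
    """Two-state up/down chain for one hot machine.

    Repair is fast when a warm spare is available (probability p_spare)
    and slow otherwise; steady-state availability is mu / (lam + mu).
    """
    mu_eff = p_spare * mu_fast + (1.0 - p_spare) * mu_slow
    return mu_eff / (lam + mu_eff)


def warm_spare_probability(avail, n_hot=100, lam=0.001, refill=0.5):
    """Two-state chain for a single warm spare slot.

    The spare is consumed at the aggregate failure rate of the hot
    machines that are up and is replenished at rate `refill`.
    """
    demand = n_hot * lam * avail
    return refill / (refill + demand)


def solve_fixed_point(tol=1e-12, max_iter=1000):
    """Alternate between the sub-models until p_spare stops changing."""
    p_spare = 1.0  # initial guess: a spare is always available
    for iteration in range(1, max_iter + 1):
        avail = hot_availability(p_spare)
        p_next = warm_spare_probability(avail)
        if abs(p_next - p_spare) < tol:
            return avail, p_next, iteration
        p_spare = p_next
    raise RuntimeError("fixed-point iteration did not converge")


availability, p_spare, iterations = solve_fixed_point()
print(f"availability={availability:.6f} p_spare={p_spare:.6f} "
      f"iterations={iterations}")
```

Each sub-model here is a two-state chain solved in closed form, so the combined map sends the interval [0, 1] continuously into itself; that is the kind of property an existence argument for a fixed point typically relies on. Realistic sub-models would be solved numerically, but the outer loop would keep the same shape.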