• DocumentCode
    2843634
  • Title
    Building a High Serviceability Model by Checkpointing and Replication Strategy in Cloud Computing Environments
  • Author
    Sun, Dawei ; Chang, Guiran ; Miao, Changsheng ; Wang, Xingwei
  • Author_Institution
    Sch. of Inf. Sci. & Eng., Northeastern Univ., Shenyang, China
  • fYear
    2012
  • fDate
    18-21 June 2012
  • Firstpage
    578
  • Lastpage
    587
  • Abstract
    Fault tolerance is one of the major obstacles to opening up a new era of high serviceability cloud computing, because it plays a key role in ensuring cloud serviceability. In most current clouds, checkpointing, the process of saving application states, and replication, the process of replicating hot data, usually to stable storage, have been the two most common fault tolerance strategies. However, when, where, and how often to insert checkpoints or to replicate hot data are challenges that have been ignored in clouds. In this paper, the definitions of fault, error, and failure in a cloud are given, and a high serviceability model by checkpointing and replication strategy, HSCR, is put forward. It includes: (1) analyzing the mathematical relationship between different failure rates and two different fault tolerance strategies, namely the checkpointing fault tolerance strategy and the data replication fault tolerance strategy; and (2) building a high serviceability checkpointing fault tolerance model and a high serviceability replication fault tolerance model, then combining the two fault tolerance models to maximize serviceability and meet the SLOs. Experimental results conclusively demonstrate that the high serviceability model HSCR has high potential, as it provides efficient fault tolerance enhancements, significant cloud serviceability improvement, and high SLO satisfaction.
  • Keywords
    checkpointing; cloud computing; software fault tolerance; HSCR; SLO satisfaction; cloud computing environment; cloud serviceability improvement; failure rate; fault tolerance; hot data replication; mathematical relationship; replication strategy; serviceability model; stable storage; Analytical models; Computational modeling; Density functional theory; Fault tolerant systems; checkpointing model; high serviceability; replication model
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems Workshops (ICDCSW), 2012 32nd International Conference on
  • Conference_Location
    Macau
  • ISSN
    1545-0678
  • Print_ISBN
    978-1-4673-1423-7
  • Type
    conf
  • DOI
    10.1109/ICDCSW.2012.6
  • Filename
    6258208