• DocumentCode
    1699307
  • Title
    Applying simulation to the design and performance evaluation of fault-tolerant systems
  • Author
    Alvarez, Guillermo A.; Cristian, Flaviu
  • Author_Institution
    California Univ., San Diego, La Jolla, CA, USA
  • fYear
    1997
  • Firstpage
    35
  • Lastpage
    42
  • Abstract
    The paper illustrates how the CESIUM simulation tool can be used for the design and performance evaluation of fault-tolerant and real-time systems, in addition to testing the correctness of protocol implementations. We calibrate three increasingly accurate simulation models of a network of workstations using independently obtained data. For a sample group membership protocol, the predictions of the simulator are very close to the actual performance measured in the real system. We also apply CESIUM to the evaluation of two potential improvements to the protocol, performing experiments that would have been difficult to implement in the real system. The results of the simulations give us valuable insight into how to tune configuration parameters, as well as into the performance gains of the improved versions. Our experience shows that CESIUM can be used to develop best-effort services that adapt their quality of service according to the failures that occur during operation.
  • Keywords
    digital simulation; fault tolerant computing; performance evaluation; protocols; reliability; software fault tolerance; telecommunication computing; virtual machines; CESIUM simulation tool; accurate simulation models; best effort services; configuration parameter tuning; correctness testing; fault tolerant systems; network of workstations; performance gains; protocol implementations; quality of service; real time systems; sample group membership protocol; Debugging; Engines; Fault tolerance; Java; Object oriented modeling; Software testing; System testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems, 1997. Proceedings., The Sixteenth Symposium on
  • Conference_Location
    Durham, NC
  • ISSN
    1060-9857
  • Print_ISBN
    0-8186-8177-2
  • Type
    conf
  • DOI
    10.1109/RELDIS.1997.632794
  • Filename
    632794