• DocumentCode
    3484352
  • Title
    Execution-driven simulation of error recovery techniques for multicomputers
  • Author
    Frazier, Tiffany M.; Tamir, Yuval
  • Author_Institution
    Dept. of Comput. Sci., California Univ., Los Angeles, CA, USA
  • fYear
    1997
  • fDate
    7-9 Apr 1997
  • Firstpage
    4
  • Lastpage
    13
  • Abstract
    DERT (Distributed Error Recovery Testbed) is a testbed for simulation and performance evaluation of several classes of application-transparent distributed error recovery schemes. DERT is built on top of an event-driven, message-passing, object-oriented, multithreaded simulation kernel. Actual compiled distributed applications are instrumented for data collection and executed on the simulated multicomputer. Checkpointing is implemented in full detail, including associated overhead per message, additional messages, and changes to the memory system. DERT allows easy modification of a wide variety of system parameters, thus offering a level of flexibility not easily achieved by experimentation on a particular real machine. This paper describes the design, functionality, and performance of DERT. The main problems encountered in DERT's development are discussed, as well as examples of its use in evaluating recovery schemes.
  • Keywords
    discrete event simulation; distributed memory systems; message passing; object-oriented programming; performance evaluation; system recovery; checkpointing; distributed error recovery testbed; error recovery techniques; execution-driven simulation; functionality; message passing; multicomputers; multithreaded simulation kernel; object-oriented kernel; performance evaluation; recovery schemes; simulated multicomputer; Application software; Checkpointing; Computational modeling; Computer errors; Fault tolerant systems; Instruments; Kernel; Object oriented modeling; Performance analysis; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the 30th Annual Simulation Symposium, 1997
  • Conference_Location
    Atlanta, GA
  • ISSN
    1080-241X
  • Print_ISBN
    0-8186-7934-4
  • Type
    conf
  • DOI
    10.1109/SIMSYM.1997.586449
  • Filename
    586449