• DocumentCode
    1929206
  • Title

    A Framework for Robust HLA-based Distributed Simulations

  • Author

    Chen, Dan ; Turner, Stephen John ; Cai, Wentong

  • Author_Institution
    University of Birmingham, UK
  • fYear
    2006
  • fDate
    2006
  • Firstpage
    183
  • Lastpage
    192
  • Abstract
    The High Level Architecture (HLA) is a standard for the interoperability and reuse of simulation components, referred to as federates. Large-scale HLA-compliant simulations are built to study complex problems, and they often involve a large number of federates and vast computing resources. Simulation federates running at different locations are liable to failure. The failure of one federate can lead to the crash of the overall simulation execution. Such risk increases with the scale of a distributed simulation. Hence, fault tolerance is required to support runtime robustness. This paper introduces a framework for robust HLA-based distributed simulations using a “Decoupled Federate Architecture”. Our framework exploits the architecture to provide a generic fault-tolerant model that uses a “dynamic substitution” approach to deal with failure. A sender-based method is designed to ensure reliable in-transit message delivery, which is coupled with a novel algorithm to perform effective fossil collection. The fault-tolerant model also avoids any unnecessary repeated computation when handling failure. The framework supports reusability of legacy federate code, and it is platform-neutral and independent of federate modeling approaches. Experiments have been carried out to validate and benchmark the fault-tolerant federates using an example of a simple supply-chain simulation. The experimental results show that the framework provides correct failure recovery and indicate that the overhead for facilitating fault tolerance is minimal.
  • Keywords
    Computational modeling; Computer architecture; Computer crashes; Discrete event simulation; Fault tolerance; Internet; Large-scale systems; Military computing; Robustness; Runtime;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Principles of Advanced and Distributed Simulation, 2006. PADS 2006. 20th Workshop on
  • Conference_Location
    Singapore
  • ISSN
    1087-4097
  • Print_ISBN
    0-7695-2587-3
  • Type

    conf

  • DOI
    10.1109/PADS.2006.7
  • Filename
    1630730