• DocumentCode
    2191016
  • Title

    Introspection-Based Fault Tolerance for COTS-Based High-Capability Computation in Space

  • Author

    James, Mark L. ; Shapiro, Andrew A. ; Springer, Paul L. ; Zima, Hans P.

  • Author_Institution
    Jet Propulsion Lab., California Inst. of Technol., Pasadena, CA, USA
  • fYear
    2008
  • fDate
    21-23 Jan. 2008
  • Firstpage
    74
  • Lastpage
    83
  • Abstract
    Future missions of deep space exploration face the challenge of designing, building, and operating progressively more capable autonomous spacecraft and planetary rovers. Given the communication latencies and bandwidth limitations for such missions, the need for increased autonomy becomes mandatory, along with the requirement for enhanced on-board computational capabilities while in deep space or time-critical situations. This will result in dramatic changes in the way missions will be conducted and supported by on-board computing systems. Specifically, the traditional approach of relying exclusively on radiation-hardened hardware and modular redundancy will not be able to deliver the required computational power. As a consequence, such systems are expected to include high-capability low-power components based on emerging Commercial-Off-The-Shelf (COTS) multi-core technology. This paper describes the design of a generic framework for introspection that supports runtime monitoring and analysis of program execution as well as a feedback-oriented recovery from faults. One of the first applications of this framework will be to provide flexible software fault tolerance matched to the requirements and properties of applications by exploiting knowledge that is either contained in an application knowledge base, provided by users, or automatically derived from specifications. A prototype implementation is currently in progress at the Jet Propulsion Laboratory, California Institute of Technology, targeting a cluster of Cell Broadband Engines.
  • Keywords
    fault tolerance; multiprocessing systems; software packages; space vehicles; COTS; autonomous spacecraft; commercial-off-the-shelf multi-core technology; deep space exploration; feedback-oriented recovery; flexible software fault tolerance; high-capability computation; introspection-based fault tolerance; planetary rovers; runtime monitoring; Application software; Bandwidth; Delay; Fault tolerance; Hardware; Redundancy; Space exploration; Space missions; Space vehicles; Time factors; fault tolerance; high-performance computing; introspection; space-borne computing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovative Architecture for Future Generation High-Performance Processors and Systems (IWIA), 2008 International Workshop on
  • Conference_Location
    Hilo, HI
  • ISSN
    1537-3223
  • Print_ISBN
    978-1-4244-6465-4
  • Electronic_ISBN
    1537-3223
  • Type

    conf

  • DOI
    10.1109/IWIA.2008.11
  • Filename
    5453556