DocumentCode :
3334083
Title :
Error detection and handling in a superscalar, speculative out-of-order execution processor system
Author :
Saxena, N. ; Chien Chen ; Swami, R. ; Osone, H. ; Thusoo, S. ; Lyon, D. ; Chang, D. ; Dharmaraj, A. ; Patkar, N. ; Lu, Y. ; Chia, B.
Author_Institution :
HaL Comput. Syst., Campbell, CA, USA
fYear :
1995
fDate :
27-30 June 1995
Firstpage :
464
Lastpage :
471
Abstract :
The HaL SPARC64 Processor, the first 64-bit SPARC-V9 architecture implementation, uses several techniques to ensure a high degree of system reliability, error detection, and error recovery. The CPU of the multi-chip module processor has a superscalar, speculative issue unit and an out-of-order execution datapath. These two processor components complicate the maintenance of precise state in the event of errors. By exploiting the SPARC-V9 architectural features and the micro-architecture for speculative execution, SPARC64 maintains precise state in the event of exceptions and errors, logs and reports errors, and facilitates error detection during full system bring-up. The paper presents details of error detection and handling in the CPU, the cache system, and the Memory Management Unit (MMU). The HaL R1 system also implements a fault-secure memory system design. The memory system corrects all single-bit errors, detects double-bit errors, detects single address line failures, and detects all single dynamic RAM (DRAM) chip failures. Certain debug features that are useful during system bring-up have been added to the system.
Keywords :
DRAM chips; computer architecture; computer debugging; digital storage; error detection; error handling; fault tolerant computing; multichip modules; program debugging; reliability; software fault tolerance; storage units; system recovery; 64-bit SPARC-V9 architecture implementation; CPU; HaL R1; HaL SPARC64 Processor; Memory Management Unit; cache system; error detection; error handling; error logging; error recovery; error reporting; fault-secure memory system design; full system bringup; micro-architecture; multi-chip module processor; out-of-order execution datapath; precise state maintenance; processor components; speculative execution; superscalar speculative issue unit; superscalar speculative out-of-order execution processor system; system reliability; Aerospace industry; Availability; Computer architecture; Computer errors; Error correction; Memory management; Out of order; Random access memory; Reliability; Workstations;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on
Conference_Location :
Pasadena, CA, USA
Print_ISBN :
0-8186-7079-7
Type :
conf
DOI :
10.1109/FTCS.1995.466952
Filename :
466952