Error Detection and Handling in a Superscalar, Speculative Out-of-Order Execution Processor System

  • Authors:
  • H. Osone;S. Thusoo;A. Dharmaraj;B. Chia

  • Affiliations:
  • -;-;-;-

  • Venue:
  • FTCS '95 Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
  • Year:
  • 1995

Quantified Score

Hi-index 0.00

Visualization

Abstract

Abstract: The HaL SPARC64 Processor, the first 64-bit SPARC-V9 architecture implementation, uses several techniques to ensure a high degree of system reliability, error detection, and error recovery. The CPU of the multi-chip module processor has a superscalar, speculative issue unit, and an out-of-order execution datapath. These two processor components complicate the maintenance of precise state in the event of errors. By exploiting the SPARC-V9 architectural features, and the micro-architecture for speculative execution, SPARC64 maintains precise state in the event of exceptions and errors, logs and reports errors, and facilitates error detection during full system bringup. The paper presents details of error detection and handling in the CPU, the cache system, and the Memory Management Unit(MMU). The HaL R1 system also implements a fault-secure memory system design. The memory system corrects all single-bit errors, detects double bit errors, detects single address line failures, and detects all single dynamic RAM (DRAM) chip failures. Certain debug features have been added to the system that are useful during system bring-up.