Using Field-Repairable Control Logic to Correct Design Errors in Microprocessors

  • Authors:
  • I. Wagner;V. Bertacco;T. Austin

  • Affiliations:
  • Dept. of Electr. Eng. & Comput. Sci., Michigan Univ., Ann Arbor, MI;-;-

  • Venue:
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • Year:
  • 2008

Abstract

Functional correctness is a vital attribute of any hardware design. Unfortunately, due to extremely complex architectures, widespread components, such as microprocessors, are often released with latent bugs. The inability of modern verification tools to keep pace with the fast growth of design complexity exacerbates the problem. In this paper, we propose a novel hardware-patching mechanism, called field-repairable control logic (FRCL), that is designed for in-the-field correction of errors in the design's control logic, which our analysis shows to be the most common type of defect. Our solution introduces an additional component in the processor's hardware, a state matcher, that can be programmed to identify erroneous configurations using signals in the critical control state of the processor. Once a flawed configuration is "matched," the processor switches into a degraded mode, a mode of operation that excludes most features of the system and is simple enough to be formally verified, yet is still capable of executing the full instruction-set architecture one instruction at a time. Once the program segment exposing the design flaw has been executed in degraded mode, the processor can be switched back to its full-performance mode. In this paper, we analyze a range of approaches to selecting the signals that comprise the processor's critical control state and evaluate their effectiveness in representing a variety of design errors. We also introduce a new metric (average specificity per signal) that encodes the bug-detection capability and amount of control state of a particular critical signal set. We demonstrate that the FRCL can support the detection and correction of multiple design errors with a performance impact of less than 5% as long as the incidence of the flawed configurations is below 1% of dynamic instructions. In addition, the area impact of our solution is less than 2% for the two microprocessor designs that we investigated in our experiments.
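
The matching-and-fallback behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the abstract does not specify how patterns are encoded, so the value/care-mask (ternary) encoding, the names `Pattern`, `StateMatcher`, and `run`, and the fixed `recovery_len` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """Hypothetical ternary pattern over the critical control state.

    Bits set in `care` must equal the corresponding bits of `value`;
    all other bits are treated as don't-care. (Assumed encoding.)
    """
    value: int
    care: int

    def matches(self, state: int) -> bool:
        return (state & self.care) == (self.value & self.care)


class StateMatcher:
    """Programmable matcher holding patterns for known-flawed configurations."""

    def __init__(self, patterns):
        self.patterns = list(patterns)

    def is_flawed(self, state: int) -> bool:
        return any(p.matches(state) for p in self.patterns)


def run(trace, matcher, recovery_len=1):
    """Return the operating mode ('full' or 'degraded') for each step.

    `trace` is a sequence of control-state snapshots, one per step.
    On a match, the model stays in degraded mode for `recovery_len`
    steps (an assumed, fixed recovery window) and then returns to
    full-performance mode.
    """
    modes = []
    remaining = 0
    for state in trace:
        if matcher.is_flawed(state):
            remaining = recovery_len
        if remaining > 0:
            modes.append("degraded")
            remaining -= 1
        else:
            modes.append("full")
    return modes


# One pattern: the upper three bits must be 101, the lowest bit is don't-care.
matcher = StateMatcher([Pattern(value=0b1010, care=0b1110)])
modes = run([0b0000, 0b1011, 0b0001, 0b0010], matcher, recovery_len=2)
```

In this model, only the second snapshot matches the pattern, so the second and third steps run in degraded mode and the rest run at full performance. Don't-care bits let a single pattern cover several related configurations, at the cost of possibly matching correct ones, which only affects performance and not correctness, because degraded mode still executes the full instruction set.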