Quick detection of difficult bugs for effective post-silicon validation

  • Authors:
  • David Lin;Ted Hong;Farzan Fallah;Nagib Hakim;Subhasish Mitra

  • Affiliations:
  • Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA;Intel Corporation, Santa Clara, CA;Stanford University, Stanford, CA

  • Venue:
  • Proceedings of the 49th Annual Design Automation Conference
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a new technique for systematically creating postsilicon validation tests that quickly detect bugs in processor cores and uncore components (cache controllers, memory controllers, on-chip networks) of multi-core System on Chips (SoCs). Such quick detection is essential because long error detection latency, the time elapsed between the occurrence of an error due to a bug and its manifestation as an observable failure, severely limits the effectiveness of existing post-silicon validation approaches. In addition, we provide a list of realistic bug scenarios abstracted from "difficult" bugs that occurred in commercial multi-core SoCs. Our results for an OpenSPARC T2-like multi-core SoC demonstrate: 1. Error detection latencies of "typical" post-silicon validation tests can be very long, up to billions of clock cycles, especially for bugs in uncore components. 2. Our new technique shortens error detection latencies by several orders of magnitude to only a few hundred cycles for most bug scenarios. 3. Our new technique enables 2-fold increase in bug coverage. An important feature of our technique is its software-only implementation without any hardware modification. Hence, it is readily applicable to existing designs.