Viper: virtual pipelines for enhanced reliability

Authors:
Andrea Pellegrini;Joseph L. Greathouse;Valeria Bertacco
Affiliations:
University of Michigan, Ann Arbor, MI;University of Michigan, Ann Arbor, MI;University of Michigan, Ann Arbor, MI
Venue:
Proceedings of the 39th Annual International Symposium on Computer Architecture
Year:
2012

Abstract

The reliability of future processors is threatened by decreasing transistor robustness. Current architectures focus on delivering high performance at low cost; lifetime device reliability is a secondary concern. As the rate of permanent hardware faults increases, robustness will become a first-class constraint for even low-cost systems. Current research into reliable architectures has focused on ad hoc solutions to improve designs without altering their centralized control logic. Unfortunately, this centralized control presents a single point of failure, which limits long-term robustness. To address this issue, we introduce Viper, an architecture built from a redundant collection of fine-grained hardware components. Instructions are perceived as customers that require a sequence of services in order to properly execute. The hardware components vie to perform what services they can, dynamically forming virtual pipelines that avoid defective hardware. This is done using distributed control logic, which avoids a single point of failure by construction. Viper can tolerate a high number of permanent faults due to its inherent redundancy. As fault counts increase, its performance degrades more gracefully than traditional centralized-logic architectures. We estimate that fault rates higher than one permanent fault per 12 million transistors, on average, cause the throughput of a classic CMP design to fall below that of a Viper design of similar size.
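
The customer-and-service idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's mechanism. The names `Component`, `bid`, and `form_virtual_pipeline`, the four-service sequence, and the first-bidder selection rule are all assumptions made for illustration. The single Python loop is only a simulation convenience; the abstract states that Viper makes these decisions with distributed control logic, which it does not describe in detail.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical fine-grained hardware unit offering a set of services."""
    name: str
    services: frozenset
    faulty: bool = False

    def bid(self, service):
        # A component volunteers only if it is healthy and offers the service.
        return (not self.faulty) and service in self.services


def form_virtual_pipeline(required_services, components):
    """Pick one healthy component per required service.

    Returns a list of component names, or None when some service
    has no healthy provider left.
    """
    pipeline = []
    for service in required_services:
        bidders = [c for c in components if c.bid(service)]
        if not bidders:
            return None
        # Assumption: take the first bidder; a real design could arbitrate differently.
        pipeline.append(bidders[0].name)
    return pipeline


# Assumed service sequence and two redundant providers per service.
STAGES = ["fetch", "decode", "execute", "writeback"]
components = [Component(f"{s}{i}", frozenset([s])) for s in STAGES for i in range(2)]

healthy_pipeline = form_virtual_pipeline(STAGES, components)

# Inject a permanent fault; the next pipeline routes around it.
components[4].faulty = True  # execute0
degraded_pipeline = form_virtual_pipeline(STAGES, components)

# Lose the last provider of a service; no pipeline can be formed.
components[5].faulty = True  # execute1
dead_pipeline = form_virtual_pipeline(STAGES, components)
```

In this model, `degraded_pipeline` uses `execute1` in place of the faulty `execute0`, and `dead_pipeline` is `None` once both execute providers have failed. This mirrors, in a very simplified way, how redundancy lets the design keep executing until a service has no working provider.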