An investigation of the performance of various dynamic scheduling techniques
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Complexity-effective superscalar processors
Proceedings of the 24th annual international symposium on Computer architecture
Performance improvement with circuit-level speculation
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Parameter variations and impact on circuits and microarchitecture
Proceedings of the 40th annual Design Automation Conference
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Picking Statistically Valid and Early Simulation Points
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Exploiting Microarchitectural Redundancy For Defect Tolerance
ICCD '03 Proceedings of the 21st International Conference on Computer Design
Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Variation-tolerant circuits: circuit solutions and techniques
Proceedings of the 42nd annual Design Automation Conference
Rescue: A Microarchitecture for Testability and Defect Tolerance
Proceedings of the 32nd annual international symposium on Computer Architecture
Cascaded carry-select adder (C2SA): a new structure for low-power CSA design
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Mitigating the Impact of Process Variations on Processor Register Files and Execution Units
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
A new paradigm for low-power, variation-tolerant circuit synthesis using critical path isolation
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
Tolerance to Small Delay Defects by Adaptive Clock Stretching
IOLTS '07 Proceedings of the 13th IEEE International On-Line Testing Symposium
Comparison of high-performance VLSI adders in the energy-delay space
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Device/circuit interactions at 22nm technology node
Proceedings of the 46th Annual Design Automation Conference
Proceedings of the 21st edition of the great lakes symposium on Great lakes symposium on VLSI
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Considering the effect of process variations during the ISA extension design flow
Microprocessors & Microsystems
Hi-index | 0.00 |
Pipelined processor cores are conventionally designed to accommodate the critical paths in the critical pipeline stage(s) in a single clock cycle, to ensure correctness. Such conservative design is wasteful in many cases since critical paths are rarely exercised. Thus, configuring the pipeline to operate correctly for rarely used critical paths targets the uncommon case instead of optimizing for the common case. In this study, we describe Trifecta--an architectural technique that completes common-case, subcritical path operations in a single cycle but uses two cycles when the critical path is exercised. This increases slack for both single- and two-cycle operations and offers a unique advantage under process variation. In contrast with existing mechanisms that trade power or performance for yield, Trifecta improves the yield while preserving performance and power. We applied this technique to the critical pipeline stages of a superscalar out-of-order (OoO) and a single issue in-order processor, namely instruction issue and execute, respectively. Our experiments show that the rare two-cycle operations result in a small decrease (5% for integer and 2% for floating-point benchmarks of SPEC2000) in instructions per cycle. However, the increased delay slack causes an improvement in yield-adjusted-throughput by 20% (12.7%) for an in-order (InO) processor configuration.