Alternative implementations of two-level adaptive branch prediction
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The performance impact of incomplete bypassing in processor pipelines
Proceedings of the 28th annual international symposium on Microarchitecture
IBM Journal of Research and Development - Special issue: terrestrial cosmic rays and soft errors
Exceeding the dataflow limit via value prediction
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 24th annual international symposium on Computer architecture
Complexity-effective superscalar processors
Proceedings of the 24th annual international symposium on Computer architecture
Understanding the differences between value prediction and instruction reuse
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
DIVA: a reliable substrate for deep submicron microarchitecture design
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Transient fault detection via simultaneous multithreading
Proceedings of the 27th annual international symposium on Computer architecture
On pipelining dynamic instruction scheduling logic
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Instruction distribution heuristics for quad-cluster, dynamically-scheduled, superscalar processors
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Transient-fault recovery using simultaneous multithreading
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Select-free instruction scheduling logic
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Dual use of superscalar datapath for transient-fault detection and recovery
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Increasing Instruction-Level Parallelism with Instruction Precomputation (Research Note)
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
DSN '00 Proceedings of the 2000 International Conference on Dependable Systems and Networks (formerly FTCS-30 and DCCA-8)
Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors
FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
On Some Implementation Issues for Value Prediction on Wide-Issue ILP Processors
PACT '00 Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques
Transient-fault recovery for chip multiprocessors
Proceedings of the 30th annual international symposium on Computer architecture
Complexity-effective superscalar processors
Complexity-effective superscalar processors
Instruction Replication: Reducing Delays Due to Inter-PE Communication Latency
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Instruction Replication for Clustered Microarchitectures
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Reducing the Scheduling Critical Cycle Using Wakeup Prediction
HPCA '04 Proceedings of the 10th International Symposium on High Performance Computer Architecture
PACS'03 Proceedings of the Third international conference on Power - Aware Computer Systems
Opportunistic Transient-Fault Detection
Proceedings of the 32nd annual international symposium on Computer Architecture
Exploiting Coarse-Grain Verification Parallelism for Power-Efficient Fault Tolerance
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Opportunistic Transient-Fault Detection
IEEE Micro
Self-checking instructions: reducing instruction redundancy for concurrent error detection
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
SlicK: slice-based locality exploitation for efficient redundant multithreading
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Mechanisms for bounding vulnerabilities of processor structures
Proceedings of the 34th annual international symposium on Computer architecture
Dynamic prediction of architectural vulnerability from microarchitectural state
Proceedings of the 34th annual international symposium on Computer architecture
Transient fault prediction based on anomalies in processor events
Proceedings of the conference on Design, automation and test in Europe
Proceedings of the 22nd ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware
A Decimal Floating-Point Divider Using Newton---Raphson Iteration
Journal of VLSI Signal Processing Systems
Exploiting selective placement for low-cost memory protection
ACM Transactions on Architecture and Code Optimization (TACO)
Compiler-assisted soft error detection under performance and energy constraints in embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
Architecture Design for Soft Errors
Architecture Design for Soft Errors
Instruction precomputation with memoization for fault detection
Proceedings of the Conference on Design, Automation and Test in Europe
Minimal Multi-threading: Finding and Removing Redundant Instructions in Multi-threaded Processors
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
On the exploitation of narrow-width values for improving register file reliability
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Resource-Driven optimizations for transient-fault detecting superscalar microarchitectures
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
A survey of checker architectures
ACM Computing Surveys (CSUR)
Hi-index | 0.00 |
Previous proposals for implementing instruction-level temporalredundancy in out-of-order cores have reported a performancedegradation of upto 45% in certain applications compared to anexecution which does not have any temporal redundancy. An importantcontributor to this problem is the insufficient number ofALUs for handling the amplified load injected into the core. At thesame time, increasing the number of ALUs can increase the complexityof the issue logic, which has been pointed out to be oneof the most timing critical components of the processor. This paperproposes a novel extension of a prior idea on instruction reuseto ease ALU bandwidth requirements in a complexity-effective wayby exploiting certain interesting properties of a dual (temporallyredundant) instruction stream. We present microarchitectural extensionsnecessary for implementing an instruction reuse buffer(IRB) and integrating this with the issue logic of a dual instructionstream superscalar core, and conduct extensive evaluationsto demonstrate how well it can alleviate the ALU bandwidth problem.We show that on the average we can gain back nearly 50%of the IPC loss that occurred due to ALU bandwidth limitationsfor an instruction-level temporally redundant superscalar execution,and 23% of the overall IPC loss.