Checkpoint repair for out-of-order execution machines

Authors:
W. W. Hwu; Y. N. Patt
Affiliations:
Computer Science Division, University of California, Berkeley, CA; Computer Science Division, University of California, Berkeley, CA
Venue:
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
Year:
1987

Abstract

Out-of-order execution and branch prediction are two mechanisms that can be used profitably in the design of supercomputers to increase performance. Unfortunately, using them requires some kind of repair mechanism, since situations do occur that require the computing engine to return to a known previous state. One way to handle this is checkpoint repair. In this paper we derive several properties of checkpoint repair mechanisms. In addition, we provide algorithms for performing checkpoint repair that incur very little overhead in time and modest cost in hardware. We also note that, contrary to statements made by previous researchers, our algorithms require no more complexity or time with write-back cache memory systems than with write-through cache memory systems.
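
The general idea of returning to a known previous state can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not spell out; it is a toy software model that assumes the machine state is just a register map, and the class and method names (`CheckpointedState`, `take_checkpoint`, `write`, `repair`) are hypothetical.

```python
class CheckpointedState:
    """Toy machine state that can be rolled back to a saved checkpoint.

    Hypothetical illustration only: real hardware would not copy the
    whole register file, and memory state is ignored here.
    """

    def __init__(self, registers):
        self.registers = dict(registers)
        self.checkpoints = []  # saved register snapshots, oldest first

    def take_checkpoint(self):
        # Save a copy of the current state, e.g. when a branch is predicted.
        self.checkpoints.append(dict(self.registers))
        return len(self.checkpoints) - 1  # index identifies the checkpoint

    def write(self, reg, value):
        # A speculative or out-of-order result updates the working state.
        self.registers[reg] = value

    def repair(self, checkpoint_id):
        # Misprediction or exception: restore the saved state and drop
        # this checkpoint together with every younger one.
        self.registers = dict(self.checkpoints[checkpoint_id])
        del self.checkpoints[checkpoint_id:]


# Usage: speculate past a predicted branch, then repair on a misprediction.
state = CheckpointedState({"r1": 1, "r2": 2})
cp = state.take_checkpoint()
state.write("r1", 99)   # work done on the wrongly predicted path
state.repair(cp)        # state is back to what it was at the checkpoint
```

Copying the full state at every checkpoint keeps the sketch short; it says nothing about the time or hardware cost that the paper's own algorithms are concerned with.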