Communications of the ACM - Special issue on computer architecture
Hardware/software tradeoffs for increased performance
ASPLOS I Proceedings of the first international symposium on Architectural support for programming languages and operating systems
Design of a Computer—The Control Data 6600
Design of a Computer—The Control Data 6600
Planning a computer system: Project Stretch
Planning a computer system: Project Stretch
IEEE Transactions on Computers
High-Performance Fault-Tolerant VLSI Systems Using Micro Rollback
IEEE Transactions on Computers
The interaction of architecture and operating system design
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
ACM SIGARCH Computer Architecture News
The expandable split window paradigm for exploiting fine-grain parallelsim
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Instruction-level parallelism from execution interlock collapsing
ACM SIGARCH Computer Architecture News
An architectural framework for migration from CISC to higher performance platforms
ICS '92 Proceedings of the 6th international conference on Supercomputing
Y-Pipe: a conditional branching scheme without pipeline delays
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Interlock collapsing ALU for increased instruction-level parallelism
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
An out-of-order superscalar processor with speculative execution and fast, precise interrupts
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
The MC88110 implementation of precise exceptions in a superscalar architecture
ACM SIGARCH Computer Architecture News
Enhanced superscalar hardware: the schedule table
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
History cache: hardware support for reverse execution
ACM SIGARCH Computer Architecture News
The anatomy of the register file in a multiscalar processor
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Resource allocation in a high clock rate microprocessor
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Exploiting short-lived variables in superscalar processors
Proceedings of the 28th annual international symposium on Microarchitecture
ARB: A Hardware Mechanism for Dynamic Reordering of Memory References
IEEE Transactions on Computers
Dynamically scheduled VLIW processors
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
A comparision of superscalar and decoupled access/execute architectures
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Speculative execution via address prediction and data prefetching
ICS '97 Proceedings of the 11th international conference on Supercomputing
Proceedings of the 24th annual international symposium on Computer architecture
Micro-preemption synthesis: an enabling mechanism for multi-task VLSI systems
ICCAD '97 Proceedings of the 1997 IEEE/ACM international conference on Computer-aided design
Selective eager execution on the PolyPath architecture
Proceedings of the 25th annual international symposium on Computer architecture
A look at several memory management units, TLB-refill mechanisms, and page table organizations
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
On the scheduling of variable latency functional units
Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
Code transformations to improve memory parallelism
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Transient fault detection via simultaneous multithreading
Proceedings of the 27th annual international symposium on Computer architecture
Proceedings of the 38th annual Design Automation Conference
IEEE Micro
Interrupt Handling for Out-of-Order Execution Processors
IEEE Transactions on Computers
Compiler-Assisted Multiple Instruction Rollback Recovery Using a Read Buffer
IEEE Transactions on Computers
Error Recovery in Shared Memory Multiprocessors Using Private Caches
IEEE Transactions on Parallel and Distributed Systems
Weld: A Multithreading Technique Towards Latency-Tolerant VLIW Processors
HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
Speculative Sequential Consistency with Little Custom Storage
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Realizing high IPC through a scalable memory-latency tolerant multipath microarchitecture
ACM SIGARCH Computer Architecture News
Cherry: checkpointed early resource recycling in out-of-order microprocessors
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Micronets: a model for decentralising control in asynchronous processor architectures
ASYNC '95 Proceedings of the 2nd Working Conference on Asynchronous Design Methodologies
Memory Faults in Asynchronous Microprocessors
ASYNC '99 Proceedings of the 5th International Symposium on Advanced Research in Asynchronous Circuits and Systems
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Tradeoffs in Buffering Memory State for Thread-Level Speculation in Multiprocessors
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Overcoming the limitations of conventional vector processors
Proceedings of the 30th annual international symposium on Computer architecture
Repairing return address stack for buffer overflow protection
Proceedings of the 1st conference on Computing frontiers
Reducing register pressure through LAER algorithm
ACSC '04 Proceedings of the 27th Australasian conference on Computer science - Volume 26
Techniques to Reduce the Soft Error Rate of a High-Performance Microprocessor
Proceedings of the 31st annual international symposium on Computer architecture
Tradeoffs in buffering speculative memory state for thread-level speculation in multiprocessors
ACM Transactions on Architecture and Code Optimization (TACO)
Memory State Compressors for Giga-Scale Checkpoint/Restore
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Incremental Commit Groups for Non-Atomic Trace Processing
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Decomposing memory performance: data structures and phases
Proceedings of the 5th international symposium on Memory management
BranchTap: improving performance with very few checkpoints through adaptive speculation control
Proceedings of the 20th annual international conference on Supercomputing
Architecture of a Self-Checkpointing Microprocessor that Incorporates Nanomagnetic Devices
IEEE Transactions on Computers
Error Recovery in Parallel Systems of Pipelined Processors with Caches
ICPP '94 Proceedings of the 1994 International Conference on Parallel Processing - Volume 01
Visual simulator for ILP dynamic OOO processor
WCAE '04 Proceedings of the 2004 workshop on Computer architecture education: held in conjunction with the 31st International Symposium on Computer Architecture
Building a large instruction window through ROB compression
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
Exploiting virtual registers to reduce pressure on real registers
ACM Transactions on Architecture and Code Optimization (TACO)
Hiding the misprediction penalty of a resource-efficient high-performance processor
ACM Transactions on Architecture and Code Optimization (TACO)
Improving single-thread performance with fine-grain state maintenance
Proceedings of the 5th conference on Computing frontiers
Asymmetrically banked value-aware register files for low-energy and high-performance
Microprocessors & Microsystems
Reexecution and Selective Reuse in Checkpoint Processors
Transactions on High-Performance Embedded Architectures and Compilers II
Checkpoint allocation and release
ACM Transactions on Architecture and Code Optimization (TACO)
Formal Verification of Gate-Level Computer Systems
CSR '09 Proceedings of the Fourth International Computer Science Symposium in Russia on Computer Science - Theory and Applications
Architecture Design for Soft Errors
Architecture Design for Soft Errors
Virtual registers: reducing register pressure without enlarging the register file
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Turbo-ROB: a low cost checkpoint/restore accelerator
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Task superscalar: using processors as functional units
HotPar'10 Proceedings of the 2nd USENIX conference on Hot topics in parallelism
Comparing FPGA vs. custom cmos and the impact on processor microarchitecture
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
CROB: implementing a large instruction window through compression
Transactions on high-performance embedded architectures and compilers III
Idempotent processor architecture
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Static analysis and compiler design for idempotent processing
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
iGPU: exception support and speculative execution on GPUs
Proceedings of the 39th Annual International Symposium on Computer Architecture
Journal of Parallel and Distributed Computing
Journal of Electronic Testing: Theory and Applications
Hi-index | 15.00 |
Five solutions to the precise interrupt problem in pipelined processors are described and evaluated. An interrupt is precise if the saved process state corresponds to a sequential model of program execution in which one instruction completes before the next begins. In a pipelined processor, precise interrupts are difficult to implement because an instruction may be initiated before its predecessors have completed. The first solution forces instructions to complete and modify the process state in architectural order. The other four solutions allow instructions to complete in any order, but additional hardware is used, so that a precise state can be restored when an interrupt occurs. All the methods are discussed in the context of a parallel pipeline structure. Simulation results for the Cray-1S scalar architecture are used to show that the first solution results in a performance degradation of at least 16%. The remaining four solutions offer better performance, and three of them result in as little as a 3% performance loss. Several extensions, including vector architectures, virtual memory, and linear pipeline structures, are briefly discussed.