Instruction issue logic for high-performance, interruptable pipelined processors
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
The problems of data dependency resolution and precise interrupt implementation in pipelined processors are combined. A design for a hardware mechanism that resolves dependencies dynamically and, at the same time, guarantees precise interrupts is presented. Simulation studies show that by resolving dependencies the proposed mechanism is able to obtain a significant speedup over a simple instruction issue mechanism as well as implement precise interrupts.
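
To illustrate the general idea the abstract names (instructions execute as soon as their operands are ready, while architectural state is updated strictly in program order so that an interrupt is precise), here is a toy sketch. It is not the mechanism designed in the paper: the `Instr` record, the `simulate` function, the unbounded window, the unlimited functional units, and the latencies are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    dest: str                      # register written by this instruction
    srcs: tuple                    # registers read by this instruction
    latency: int                   # execution time in cycles (assumed >= 1)
    faults: bool = False           # hypothetical flag: raises an exception
    done_at: Optional[int] = None  # cycle when the result becomes available


def simulate(program):
    """Toy model: out-of-order execution start, in-order commit."""
    issue_order, commit_order = [], []
    head, cycle = 0, 0             # head = oldest uncommitted instruction
    while head < len(program):
        # Start every waiting instruction whose older, uncommitted
        # producers have already finished (dynamic dependency check).
        for i in range(head, len(program)):
            ins = program[i]
            if ins.done_at is not None:
                continue
            producers = [p for p in program[head:i] if p.dest in ins.srcs]
            if all(p.done_at is not None and p.done_at <= cycle
                   for p in producers):
                ins.done_at = cycle + ins.latency
                issue_order.append(i)
        # Commit finished instructions from the head only, in program order.
        while head < len(program):
            ins = program[head]
            if ins.done_at is None or ins.done_at > cycle:
                break
            if ins.faults:
                # Precise state: everything older has committed,
                # nothing younger has.
                return issue_order, commit_order, head
            commit_order.append(head)
            head += 1
        cycle += 1
    return issue_order, commit_order, None


program = [
    Instr("r1", (), 3),                   # 0: slow load
    Instr("r2", ("r1",), 1),              # 1: depends on 0
    Instr("r3", (), 1),                   # 2: independent
    Instr("r4", ("r3",), 1, faults=True), # 3: depends on 2, faults
    Instr("r5", (), 1),                   # 4: independent, younger than fault
]
issued, committed, fault = simulate(program)
print(issued, committed, fault)
```

In this run, instructions 2, 4, and 3 start executing ahead of instruction 1, yet only instructions 0 to 2 commit before the fault at instruction 3 is reported. Instruction 4 has executed but never updates architectural state, which is what makes the interrupt precise in this simplified model.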