Optimization of instruction fetch mechanisms for high issue rates
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
The case for a single-chip multiprocessor
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
The intrinsic bandwidth requirements of ordinary programs
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Value locality and load value prediction
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Eliminating operand read latency
ACM SIGARCH Computer Architecture News
Trace cache: a low latency approach to high bandwidth instruction fetching
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Exceeding the dataflow limit via value prediction
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Improving superscalar instruction dispatch and issue by exploiting dynamic code sequences
Proceedings of the 24th annual international symposium on Computer architecture
Dynamic speculation and synchronization of data dependences
Proceedings of the 24th annual international symposium on Computer architecture
Value locality and speculative execution
Value locality and speculative execution
The Performance Potential of Value and Dependence Prediction
Euro-Par '97 Proceedings of the Third International Euro-Par Conference on Parallel Processing
Pipeline gating: speculation control for energy reduction
Proceedings of the 25th annual international symposium on Computer architecture
PipeRench: a co/processor for streaming multimedia acceleration
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Decoupling local variable accesses in a wide-issue superscalar processor
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Access region locality for high-bandwidth processor memory system design
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Limits of Data Value Predictability
International Journal of Parallel Programming
Extending Value Reuse to Basic Blocks with Compiler Support
IEEE Transactions on Computers
A High-Bandwidth Memory Pipeline for Wide Issue Processors
IEEE Transactions on Computers
Realizing High IPC Using Time-Tagged Resource-Flow Computing
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Realizing high IPC through a scalable memory-latency tolerant multipath microarchitecture
ACM SIGARCH Computer Architecture News
Multiple-path execution for chip multiprocessors
Journal of Systems Architecture: the EUROMICRO Journal
Supporting microthread scheduling and synchronisation in CMPs
International Journal of Parallel Programming
Access region cache with register guided memory reference partitioning
Journal of Systems Architecture: the EUROMICRO Journal
Effect of increasing chip density on the evolution of computer architectures
IBM Journal of Research and Development
Dynamic partition of memory reference instructions – a register guided approach
Euro-Par'05 Proceedings of the 11th international Euro-Par conference on Parallel Processing
Hi-index | 4.11 |
Based on their research at Carnegie Mellon University, these authors also argue for billion-transistor uniprocessors. Like Patt et al., they divide the important implementation problems into three components: instruction flow, register dataflow, and memory dataflow. They also argue for trace caches and advanced branch prediction. Their article, however, focuses on using massive speculation at all levels to improve performance. They claim that without this much speculation, future processors will be limited by true data dependences, and will be unable to harvest enough instruction-level parallelism (ILP) to improve performance satisfactorily. Their investigations discovered large speedups on code that have traditionally not been amenable to finding ILP.