Design of the IBM RISC System/6000 floating-point execution unit
IBM Journal of Research and Development
Single instruction stream parallelism is greater than two
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Limits of control flow on parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Alternative implementations of two-level adaptive branch prediction
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Dynamic dependency analysis of ordinary programs
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Improving the accuracy of dynamic branch prediction using branch correlation
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Dynamic path-based branch correlation
Proceedings of the 28th annual international symposium on Microarchitecture
Control flow prediction with tree-like subgraphs for superscalar processors
Proceedings of the 28th annual international symposium on Microarchitecture
Evaluation of Hardware-Based Stride and Sequential Prefetching in Shared-Memory Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Value locality and load value prediction
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Exceeding the dataflow limit via value prediction
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 24th annual international symposium on Computer architecture
Effective Hardware-Based Data Prefetching for High-Performance Processors
IEEE Transactions on Computers
The potential of data value speculation to boost ILP
ICS '98 Proceedings of the 12th international conference on Supercomputing
Load execution latency reduction
ICS '98 Proceedings of the 12th international conference on Supercomputing
Speculative multithreaded processors
ICS '98 Proceedings of the 12th international conference on Supercomputing
Modeling program predictability
Proceedings of the 25th annual international symposium on Computer architecture
Predictive techniques for aggressive load speculation
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Understanding the differences between value prediction and instruction reuse
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
An empirical analysis of instruction repetition
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Value speculation scheduling for high performance processors
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Correlated load-address predictors
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Value prediction in VLIW machines
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Storageless value prediction using prior register values
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Cyclic dependence based data reference prediction
ICS '99 Proceedings of the 13th international conference on Supercomputing
Clustered speculative multithreaded processors
ICS '99 Proceedings of the 13th international conference on Supercomputing
Classifying load and store instructions for memory renaming
ICS '99 Proceedings of the 13th international conference on Supercomputing
Improving branch predictors by correlating on data values
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Value prediction for speculative multithreaded architectures
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Limits of Data Value Predictability
International Journal of Parallel Programming
Table size reduction for data value predictors by exploiting narrow width values
Proceedings of the 14th international conference on Supercomputing
Extending Value Reuse to Basic Blocks with Compiler Support
IEEE Transactions on Computers
Predictor-directed stream buffers
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Optimizations Enabled by a Decoupled Front-End Architecture
IEEE Transactions on Computers
Focusing processor policies via critical-path prediction
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
On Table Bandwidth and Its Update Delay for Value Prediction on Wide-Issue ILP Processors
IEEE Transactions on Computers - Special issue on the parallel architecture and compilation techniques conference
Characterization of value locality in Java programs
Workload characterization of emerging computer applications
Static load classification for improving the value predictability of data-cache misses
PLDI '02 Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation
Latency and energy aware value prediction for high-frequency processors
ICS '02 Proceedings of the 16th international conference on Supercomputing
The predictability of load address
ACM SIGARCH Computer Architecture News
Exploiting speculative value reuse using value prediction
CRPIT '02 Proceedings of the seventh Asia-Pacific conference on Computer systems architecture
Direct load: dependence-linked dataflow resolution of load address and cache coordinate
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
A general compiler framework for speculative multithreading
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
A preactivating mechanism for a VT-CMOS cache using address prediction
Proceedings of the 2002 international symposium on Low power electronics and design
Neural methods for dynamic branch prediction
ACM Transactions on Computer Systems (TOCS)
IEEE Transactions on Computers
On Augmenting Trace Cache for High-Bandwidth Value Prediction
IEEE Transactions on Computers
A Decoupled Predictor-Directed Stream Prefetching Architecture
IEEE Transactions on Computers
Modeling Value Speculation: An Optimal Edge Selection Problem
IEEE Transactions on Computers
Putting Data Value Predictors to Work in Fine-Grain Parallel Processors
HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
Using Dataflow Based Contextfor Accurate Branch Prediction
HiPC '02 Proceedings of the 9th International Conference on High Performance Computing
Exploiting Data Value Prediction in Compiler Based Thread Formation
HiPC '02 Proceedings of the 9th International Conference on High Performance Computing
Influence of Compiler Optimizations on Value Prediction
HPCN Europe 2001 Proceedings of the 9th International Conference on High-Performance Computing and Networking
A Feasibility Study of Hierarchical Multithreading
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Low-Cost Value Predictors Using Frequent Value Locality
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Runtime Association of Software Prefetch Control to Memory Access Instructions (Research Note)
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Independent Hashing as Confidence Mechanism for Value Predictors in Microprocessors
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Implementation of Hybrid Context Based Value Predictors Using Value Sequence Classification
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Reducing Energy Consumption via Low-Cost Value Prediction
PATMOS '02 Proceedings of the 12th International Workshop on Integrated Circuit Design. Power and Timing Modeling, Optimization and Simulation
Value Prediction as a Cost-Effective Solution to Improve Embedded Processors Performance
VECPAR '00 Selected Papers and Invited Talks from the 4th International Conference on Vector and Parallel Processing
Branch prediction techniques for low-power VLIW processors
Proceedings of the 13th ACM Great Lakes symposium on VLSI
Enhancing memory level parallelism via recovery-free value prediction
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Hybridizing and Coalescing Load Value Predictors
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
A Power Perspective of Value Speculation for Superscalar Microprocessors
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Partial Resolution in Data Value Predictors
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
Detecting global stride locality in value streams
Proceedings of the 30th annual international symposium on Computer architecture
Balancing Reuse Opportunities and Performance Gains with Subblock Value Reuse
IEEE Transactions on Computers
Address-free memory access based on program syntax correlation of loads and stores
IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2001 international conference on computer design (ICCD)
Thread Partitioning and Value Prediction for Exploiting Speculative Thread-Level Parallelism
IEEE Transactions on Computers
VPC3: a fast and effective trace-compression algorithm
Proceedings of the joint international conference on Measurement and modeling of computer systems
Scaling the issue window with look-ahead latency prediction
Proceedings of the 18th annual international conference on Supercomputing
Microarchitecture Optimizations for Exploiting Memory-Level Parallelism
Proceedings of the 31st annual international symposium on Computer architecture
An Efficient Value Predictor Dynamically Using Loop and Locality Properties
The Journal of Supercomputing
Automatic Generation of High-Performance Trace Compressors
Proceedings of the international symposium on Code generation and optimization
On the energy-efficiency of speculative hardware
Proceedings of the 2nd conference on Computing frontiers
Enhancing Memory-Level Parallelism via Recovery-Free Value Prediction
IEEE Transactions on Computers
Improving the Performance of Software Distributed Shared Memory with Speculation
IEEE Transactions on Parallel and Distributed Systems
An Event-Driven Multithreaded Dynamic Optimization Framework
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
The VPC Trace-Compression Algorithms
IEEE Transactions on Computers
Reducing the Latency and Area Cost of Core Swapping through Shared Helper Engines
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Dynamically configurable shared CMP helper engines for improved performance
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Improving memory system performance with energy-efficient value speculation
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Revised Stride Data Value Predictor Design
HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
TCgen 2.0: a tool to automatically generate lossless trace compressors
ACM SIGARCH Computer Architecture News
Improving the performance and power efficiency of shared helpers in CMPs
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Adaptive Caches: Effective Shaping of Cache Behavior to Workloads
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Speculative trivialization point advancing in high-performance processors
Journal of Systems Architecture: the EUROMICRO Journal
Improving instruction level parallelism through reconfigurable units in superscalar processors
ACM SIGARCH Computer Architecture News - Special issue on the 2006 reconfigurable and adaptive architecture workshop
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Compiler and hardware support for reducing the synchronization of speculative threads
ACM Transactions on Architecture and Code Optimization (TACO)
Journal of Embedded Computing - Embeded Processors and Systems: Architectural Issues and Solutions for Emerging Applications
Proceedings of the 6th ACM conference on Computing frontiers
Impact analysis of performance faults in modern microprocessors
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Speculative parallelization using state separation and multiple value prediction
Proceedings of the 2010 international symposium on Memory management
Limits for a feasible speculative trace reuse implementation
International Journal of High Performance Systems Architecture
The potential of using dynamic information flow analysis in data value prediction
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Data value prefetching method based on Markov model
ICCOMP'06 Proceedings of the 10th WSEAS international conference on Computers
Improving performance through deep value profiling and specialization with code transformation
Computer Languages, Systems and Structures
Neural confidence estimation for more accurate value prediction
HiPC'05 Proceedings of the 12th international conference on High Performance Computing
Leveraging Strength-Based Dynamic Information Flow Analysis to Enhance Data Value Prediction
ACM Transactions on Architecture and Code Optimization (TACO)
Making power-efficient data value predictions
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Exploiting thread-level speculative parallelism with software value prediction
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
Low-overhead core swapping for thermal management
PACS'04 Proceedings of the 4th international conference on Power-Aware Computer Systems
Memory Latency Hiding by Load Value Speculation for Reconfigurable Computers
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
On the Impact of Performance Faults in Modern Microprocessors
Journal of Electronic Testing: Theory and Applications
Hi-index | 0.03 |
Data dependences (data flow constraints) present a major hurdle to the amount of instruction-level parallelism that can be exploited from a program. Recent work has suggested that the limits imposed by data dependences can be overcome to some extent with the use of data value prediction. That is, when an instruction is fetched, its result can be predicted so that subsequent instructions that depend on the result can use this predicted value. When the correct result becomes available, all instructions that are data dependent on that prediction can be validated. This paper investigates a variety of techniques to carry out highly accurate data value predictions. The first technique investigates the potential of monitoring the strides by which the results produced by different instances of an instruction change. The second technique investigates the potential of pattern-based two-level prediction schemes. Simulation results of these two schemes show improvements over the existing method of predicting the last outcome. In particular, some benchmarks show improvement with the stride-based predictor and others show improvement with the pattern-based predictor. To do uniformly well across benchmarks, we combine these two predictors to form a hybrid predictor. Simulation analysis of the hybrid predictor shows its overall prediction accuracy to be better than that of the component predictors across all benchmarks.