Value locality and load value prediction

Authors:
Mikko H. Lipasti; Christopher B. Wilkerson; John Paul Shen
Affiliations:
Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA; Intel Corporation in Portland, Oregon and Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA
Venue:
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Year:
1996

Abstract

Since the introduction of virtual memory demand-paging and cache memories, computer systems have been exploiting spatial and temporal locality to reduce the average latency of a memory reference. In this paper, we introduce the notion of value locality, a third facet of locality that is frequently present in real-world programs, and describe how to effectively capture and exploit it in order to perform load value prediction. Temporal and spatial locality are attributes of storage locations, and describe the future likelihood of references to those locations or their close neighbors. In a similar vein, value locality describes the likelihood of the recurrence of a previously seen value within a storage location. Modern processors already exploit value locality in a very restricted sense through the use of control speculation (i.e., branch prediction), which seeks to predict the future value of a single condition bit based on previously seen values. Our work extends this to predict entire 32- and 64-bit register values based on previously seen values. We find that, just as condition bits are fairly predictable on a per-static-branch basis, full register values being loaded from memory are frequently predictable as well. Furthermore, we show that simple microarchitectural enhancements to two modern microprocessor implementations (based on the PowerPC 620 and Alpha 21164) that enable load value prediction can effectively exploit value locality to collapse true dependencies, reduce average memory latency and bandwidth requirements, and provide measurable performance gains.
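
The definition of value locality above can be illustrated with a small sketch. It is not the mechanism evaluated in the paper, which the abstract does not describe. It assumes a hypothetical trace of (load address, loaded value) pairs and counts how often a static load returns the same value it returned the previous time it executed, which is one simple way such locality might be measured. The function name `last_value_hit_rate` and the trace format are illustrative assumptions.

```python
def last_value_hit_rate(trace):
    """Count loads whose value repeats the last value seen by the same static load.

    trace: iterable of (load_pc, loaded_value) pairs (hypothetical format).
    Returns (hits, total) so the caller can compute a ratio.
    """
    last_seen = {}  # load_pc -> most recent value loaded by that instruction
    hits = 0
    total = 0
    for load_pc, value in trace:
        total += 1
        # A "hit" means a last-value predictor would have guessed correctly.
        if load_pc in last_seen and last_seen[load_pc] == value:
            hits += 1
        # Always remember the newest value for the next prediction.
        last_seen[load_pc] = value
    return hits, total


# Tiny made-up trace: two static loads at addresses 0x100 and 0x104.
example_trace = [
    (0x100, 7), (0x100, 7), (0x104, 1),
    (0x100, 7), (0x104, 2), (0x100, 8),
]
hits, total = last_value_hit_rate(example_trace)
print(f"{hits} of {total} loads repeated their previous value")
```

In this made-up trace only the second and third executions of the load at 0x100 repeat the previous value, so two of six loads count as hits. A hardware predictor would also need a way to verify each guess and recover from wrong ones, which this sketch leaves out.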