Exploiting Value Locality to Exceed the Dataflow Limit

Authors:
Mikko H. Lipasti;John Paul Shen
Affiliations:
-;-
Venue:
International Journal of Parallel Programming
Year:
1998

Citing 24
Cited 0

Compilers: principles, techniques, and tools

Compilers: principles, techniques, and tools
Limits of instruction-level parallelism

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Two-level adaptive training branch prediction

MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Data access microarchitectures for superscalar processors with compiler-assisted data prefetching

MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Limits of control flow on parallelism

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Link-time optimization of address calculation on a 64-bit architecture

PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Speculative disambiguation: a compilation technique for dynamic memory disambiguation

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
A performance study of software and hardware data prefetching schemes

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The multiscalar architecture

The multiscalar architecture
Dynamic memory disambiguation using the memory conflict buffer

ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Performance evaluation of the PowerPC 620 microarchitecture

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Zero-cycle loads: microarchitecture support for reducing load latency

Proceedings of the 28th annual international symposium on Microarchitecture
SPAID: software prefetching in pointer- and call-intensive environments

Proceedings of the 28th annual international symposium on Microarchitecture
Value locality and load value prediction

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Exceeding the dataflow limit via value prediction

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Predictability of load/store instruction latencies

MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
The effect of instruction fetch bandwidth on value prediction

Proceedings of the 25th annual international symposium on Computer architecture
Value locality and speculative execution

Value locality and speculative execution
VMW: A Visualization-Based Microarchitecture Workbench

Computer
The PowerPC 620 microprocessor: a high performance superscalar RISC microprocessor

COMPCON '95 Proceedings of the 40th IEEE Computer Society International Conference
An architectural alternative to optimizing compilers

ASPLOS I Proceedings of the first international symposium on Architectural support for programming languages and operating systems
A study of branch prediction strategies

ISCA '81 Proceedings of the 8th annual symposium on Computer Architecture
A computer architecture for the dynamic optimization of high-level language programs

A computer architecture for the dynamic optimization of high-level language programs
Caching Function Results: Faster Arithmetic by Avoiding Unnecessary Computation

Caching Function Results: Faster Arithmetic by Avoiding Unnecessary Computation

Quantified Score

Hi-index	0.00

Visualization

Abstract

The serialization constraints imposed by true data dependences have always been regarded as an absolute dataflow limit on the parallel execution of serial programs. This paper describes value prediction, a new technique that allows data dependent instructions to issue and execute in parallel without violating program semantics. This technique exploits value locality, or the likelihood of the recurrence of a previously-seen value within a storage location inside a computer system. Value prediction consists of predicting entire 32- and 64-bit register values based on previously-seen values. We find that values loaded from memory or generated by ALU instructions are frequently predictable. Furthermore, we show that simple microarchitectural enhancements to a modern microprocessor implementation based on the PowerPC 620 that enable value prediction can effectively exploit value locality to collapse true dependences, reduce average memory and result latencies, and provide average performance gains of 3%-23% by exceeding the dataflow limit.