Dynamic memory disambiguation in the presence of out-of-order store issuing

Authors:
Soner Onder; Rajiv Gupta
Affiliations:
Department of Computer Science, Michigan Technological University, Houghton, MI; Department of Computer Science, The University of Arizona, Tucson, AZ
Venue:
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Year:
1999

Abstract

With the help of a memory dependence predictor, the instruction scheduler can speculatively issue load instructions at the earliest possible time without causing a significant number of memory order violations. For maximum performance, the scheduler must also allow fully out-of-order issuing of store instructions, since any superfluous ordering of stores creates false memory dependencies that delay the issuing of dependent loads. Unfortunately, simple techniques for detecting memory order violations do not work well when store instructions issue out of order, because they report many false memory order violations. By using a novel memory order violation detection mechanism in the retire logic of the processor and delaying the check for memory order violations, we allow fully out-of-order issuing of store instructions without causing false memory order violations. In addition, our mechanism can take advantage of data value redundancy. We present an implementation of our technique using the store set memory dependence predictor. An out-of-order superscalar processor that uses our technique delivers an IPC that is 100%, 96%, and 85% of that of a processor equipped with an ideal memory disambiguator at issue widths of 8, 16, and 32 instructions, respectively.
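
The abstract does not describe the detection mechanism in detail, so the following is only a toy illustration of the general idea, not the authors' design. It assumes a simple issue-time detector that flags any already-issued younger load when an older store to the same address issues, and contrasts it with a hypothetical retire-time check that compares the value each load obtained with the in-order memory state. All names (`MemOp`, `simulate`, the address `A`) are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class MemOp:
    seq: int         # position in program order
    kind: str        # "ST" or "LD"
    addr: int
    issue_time: int  # when the scheduler issued the operation
    value: int = 0   # store data (unused for loads)


def simulate(ops, initial_memory):
    """Return (naive_flags, retire_flags): sets of load seq numbers flagged.

    Assumption: a load takes its value from the youngest older store to the
    same address that has already issued, else from initial memory.
    """
    issued = []
    loaded = {}          # load seq -> value obtained speculatively
    naive_flags = set()  # issue-time detector (address match only)
    for op in sorted(ops, key=lambda o: o.issue_time):
        if op.kind == "LD":
            older = [s for s in issued
                     if s.kind == "ST" and s.addr == op.addr and s.seq < op.seq]
            if older:
                loaded[op.seq] = max(older, key=lambda s: s.seq).value
            else:
                loaded[op.seq] = initial_memory.get(op.addr, 0)
        else:
            # Naive rule: an older store issuing after a younger load to the
            # same address is treated as a memory order violation.
            for ld in issued:
                if ld.kind == "LD" and ld.addr == op.addr and ld.seq > op.seq:
                    naive_flags.add(ld.seq)
        issued.append(op)

    # Hypothetical retire-time check: walk operations in program order and
    # compare each load's speculative value with the in-order memory state.
    memory = dict(initial_memory)
    retire_flags = set()
    for op in sorted(ops, key=lambda o: o.seq):
        if op.kind == "ST":
            memory[op.addr] = op.value
        elif loaded[op.seq] != memory.get(op.addr, 0):
            retire_flags.add(op.seq)
    return naive_flags, retire_flags


A = 0x100

# Stores issue out of order; the load still gets the right value from store 1.
out_of_order_stores = simulate(
    [MemOp(0, "ST", A, issue_time=3, value=1),
     MemOp(1, "ST", A, issue_time=1, value=2),
     MemOp(2, "LD", A, issue_time=2)],
    {A: 0})

# The load issues early, but the store writes the value already in memory.
redundant_value = simulate(
    [MemOp(0, "ST", A, issue_time=2, value=5),
     MemOp(1, "LD", A, issue_time=1)],
    {A: 5})

# The load issues early and reads a stale value: a real violation.
true_violation = simulate(
    [MemOp(0, "ST", A, issue_time=2, value=7),
     MemOp(1, "LD", A, issue_time=1)],
    {A: 0})
```

In the first two scenarios the issue-time detector flags the load even though it obtained the correct value, while the retire-time comparison flags nothing; only the third scenario is flagged by both. The value comparison is a stand-in chosen for brevity and may differ from the mechanism the paper actually uses.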