Dynamic memory disambiguation using the memory conflict buffer

Authors:
David M. Gallagher;William Y. Chen;Scott A. Mahlke;John C. Gyllenhaal;Wen-mei W. Hwu
Affiliations:
Center for Reliable and High-Performance Computing, University of Illinois, Urbana-Champaign, IL;Intel Corporation, Santa Clara, CA;Center for Reliable and High-Performance Computing, University of Illinois, Urbana-Champaign, IL;Center for Reliable and High-Performance Computing, University of Illinois, Urbana-Champaign, IL;Center for Reliable and High-Performance Computing, University of Illinois, Urbana-Champaign, IL
Venue:
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Year:
1994

Citing 21
Cited 50

A VLIW architecture for a trace scheduling compiler

ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
CRegs: a new kind of memory for referencing arrays and pointers

Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Run-time disambiguation: coping with statically unpredictable dependencies

IEEE Transactions on Computers
Architectural support for register allocation in the presence of aliasing

Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Efficient and exact data dependence analysis

PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Practical dependence testing

PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Pseudo-randomly interleaved memory

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Comparing static and dynamic code scheduling for multiple-instruction-issue processors

MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Eliminating false data dependences using the Omega test

PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
A safe approximate algorithm for interprocedural aliasing

PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Tolerating data access latency with register preloading

ICS '92 Proceedings of the 6th international conference on Supercomputing
Sentinel scheduling for VLIW and superscalar processors

ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Efficient superscalar performance through boosting

ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
A practical data flow framework for array reference analysis and its use in optimizations

PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
The superblock: an effective technique for VLIW and superscalar compilation

The Journal of Supercomputing - Special issue on instruction-level parallelism
Interprocedural may-alias analysis for pointers: beyond k-limiting

PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Context-sensitive interprocedural points-to analysis in the presence of function pointers

PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Speculative disambiguation: a compilation technique for dynamic memory disambiguation

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Data preload for superscalar and VLIW processors

Data preload for superscalar and VLIW processors
Speculative execution exception recovery using write-back suppression

MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Dependence Analysis for Supercomputing

Dependence Analysis for Supercomputing

Exceeding the dataflow limit via value prediction

Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Dynamic speculation and synchronization of data dependences

Proceedings of the 24th annual international symposium on Computer architecture
Value profiling

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Integrated predicated and speculative execution in the IMPACT EPIC architecture

Proceedings of the 25th annual international symposium on Computer architecture
A novel renaming scheme to exploit value temporal locality through physical register reuse and unification

MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Value speculation scheduling for high performance processors

Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Speculation techniques for improving load related instruction scheduling

ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Classifying load and store instructions for memory renaming

ICS '99 Proceedings of the 13th international conference on Supercomputing
Compiler-directed dynamic computation reuse: rationale and initial results

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
OS and compiler considerations in the design of the IA-64 architecture

ACM SIGPLAN Notices
The store-load address table and speculative register promotion

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
OS and compiler considerations in the design of the IA-64 architecture

ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Rapid profiling via stratified sampling

ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Compiler Support for Scalable and Efficient Memory Systems

IEEE Transactions on Computers
Cost effective memory disambiguation for multimedia codes

CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Compiler optimization of scalar value communication between speculative threads

Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Optimization of Machine Descriptions for Efficient Use

International Journal of Parallel Programming
Exploiting Value Locality to Exceed the Dataflow Limit

International Journal of Parallel Programming
Changing Interaction of Compiler and Architecture

Computer
Compilers for Instruction-Level Parallelism

Computer
Introducing the IA-64 Architecture

IEEE Micro
Comprehensive Redundant Load Elimination for the IA-64 Architecture

LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
Speculative Alias Analysis for Executable Code

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Limits of Task-Based Parallelism in Irregular Applications

ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Eliminating Exception Constraints of Java Programs for IA-64

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Master/slave speculative parallelization

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Compiler managed micro-cache bypassing for high performance EPIC processors

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
The Transmeta Code Morphing™ Software: using speculation, recovery, and adaptive retranslation to address real-life challenges

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Speculative register promotion using Advanced Load Address Table (ALAT)

Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Memory disambiguation for general-purpose applications

CASCON '95 Proceedings of the 1995 conference of the Centre for Advanced Studies on Collaborative research
A new speculation technique to optimize floating-point performance while preserving bit-by-bit reproducibility

ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Scalable Hardware Memory Disambiguation for High ILP Processors

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Beating in-order stalls with "flea-flicker" two-pass pipelining

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Field-testing IMPACT EPIC research results in Itanium 2

Proceedings of the 31st annual international symposium on Computer architecture
Scalable selective re-execution for EDGE architectures

ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Compiler orchestrated prefetching via speculation and predication

ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
A General Compiler Framework for Speculative Optimizations Using Data Speculative Code Motion

Proceedings of the international symposium on Code generation and optimization
Store Vulnerability Window (SVW): Re-Execution Filtering for Enhanced Load Optimization

Proceedings of the 32nd annual international symposium on Computer Architecture
Dynamic memory interval test vs. interprocedural pointer analysis in multimedia applications

ACM Transactions on Architecture and Code Optimization (TACO)
Address-Indexed Memory Disambiguation and Store-to-Load Forwarding

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Beating In-Order Stalls with "Flea-Flicker" Two-Pass Pipelining

IEEE Transactions on Computers
SoftSig: software-exposed hardware signatures for code analysis and optimization

Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Compiler and hardware support for reducing the synchronization of speculative threads

ACM Transactions on Architecture and Code Optimization (TACO)
Simultaneous speculative threading: a novel pipeline architecture implemented in sun's rock processor

Proceedings of the 36th annual international symposium on Computer architecture
A case for an SC-preserving compiler

Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
OUTRIDER: efficient memory latency tolerance with decoupled strands

Proceedings of the 38th annual international symposium on Computer architecture
Trimaran: an infrastructure for research in instruction-level parallelism

LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Static analysis and compiler design for idempotent processing

Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
SMARQ: Software-Managed Alias Register Queue for Dynamic Optimizations

MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Runtime dependency analysis for loop pipelining in high-level synthesis

Proceedings of the 50th Annual Design Automation Conference

Quantified Score

Hi-index	0.01

Visualization

Abstract

To exploit instruction level parallelism, compilers for VLIW and superscalar processors often employ static code scheduling. However, the available code reordering may be severely restricted due to ambiguous dependences between memory instructions. This paper introduces a simple hardware mechanism, referred to as the memory conflict buffer, which facilitates static code scheduling in the presence of memory store/load dependences. Correct program execution is ensured by the memory conflict buffer and repair code provided by the compiler. With this addition, significant speedup over an aggressive code scheduling model can be achieved for both non-numerical and numerical programs.