A VLIW architecture for a trace scheduling compiler
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
CRegs: a new kind of memory for referencing arrays and pointers
Proceedings of the 1988 ACM/IEEE conference on Supercomputing
Run-time disambiguation: coping with statically unpredictable dependencies
IEEE Transactions on Computers
Architectural support for register allocation in the presence of aliasing
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
Efficient and exact data dependence analysis
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Pseudo-randomly interleaved memory
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Comparing static and dynamic code scheduling for multiple-instruction-issue processors
MICRO 24 Proceedings of the 24th annual international symposium on Microarchitecture
Eliminating false data dependences using the Omega test
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
A safe approximate algorithm for interprocedural aliasing
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Tolerating data access latency with register preloading
ICS '92 Proceedings of the 6th international conference on Supercomputing
Sentinel scheduling for VLIW and superscalar processors
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Efficient superscalar performance through boosting
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
A practical data flow framework for array reference analysis and its use in optimizations
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
The superblock: an effective technique for VLIW and superscalar compilation
The Journal of Supercomputing - Special issue on instruction-level parallelism
Interprocedural may-alias analysis for pointers: beyond k-limiting
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Context-sensitive interprocedural points-to analysis in the presence of function pointers
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Speculative disambiguation: a compilation technique for dynamic memory disambiguation
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Data preload for superscalar and VLIW processors
Data preload for superscalar and VLIW processors
Speculative execution exception recovery using write-back suppression
MICRO 26 Proceedings of the 26th annual international symposium on Microarchitecture
Dependence Analysis for Supercomputing
Dependence Analysis for Supercomputing
Exceeding the dataflow limit via value prediction
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
Dynamic speculation and synchronization of data dependences
Proceedings of the 24th annual international symposium on Computer architecture
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Integrated predicated and speculative execution in the IMPACT EPIC architecture
Proceedings of the 25th annual international symposium on Computer architecture
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Value speculation scheduling for high performance processors
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Speculation techniques for improving load related instruction scheduling
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Classifying load and store instructions for memory renaming
ICS '99 Proceedings of the 13th international conference on Supercomputing
Compiler-directed dynamic computation reuse: rationale and initial results
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
OS and compiler considerations in the design of the IA-64 architecture
ACM SIGPLAN Notices
The store-load address table and speculative register promotion
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
OS and compiler considerations in the design of the IA-64 architecture
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Rapid profiling via stratified sampling
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Compiler Support for Scalable and Efficient Memory Systems
IEEE Transactions on Computers
Cost effective memory disambiguation for multimedia codes
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Compiler optimization of scalar value communication between speculative threads
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Optimization of Machine Descriptions for Efficient Use
International Journal of Parallel Programming
Exploiting Value Locality to Exceed the Dataflow Limit
International Journal of Parallel Programming
Introducing the IA-64 Architecture
IEEE Micro
Comprehensive Redundant Load Elimination for the IA-64 Architecture
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
Speculative Alias Analysis for Executable Code
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Limits of Task-Based Parallelism in Irregular Applications
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Eliminating Exception Constraints of Java Programs for IA-64
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Master/slave speculative parallelization
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Compiler managed micro-cache bypassing for high performance EPIC processors
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Speculative register promotion using Advanced Load Address Table (ALAT)
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
Memory disambiguation for general-purpose applications
CASCON '95 Proceedings of the 1995 conference of the Centre for Advanced Studies on Collaborative research
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Scalable Hardware Memory Disambiguation for High ILP Processors
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Beating in-order stalls with "flea-flicker" two-pass pipelining
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Field-testing IMPACT EPIC research results in Itanium 2
Proceedings of the 31st annual international symposium on Computer architecture
Scalable selective re-execution for EDGE architectures
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Compiler orchestrated prefetching via speculation and predication
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
A General Compiler Framework for Speculative Optimizations Using Data Speculative Code Motion
Proceedings of the international symposium on Code generation and optimization
Store Vulnerability Window (SVW): Re-Execution Filtering for Enhanced Load Optimization
Proceedings of the 32nd annual international symposium on Computer Architecture
Dynamic memory interval test vs. interprocedural pointer analysis in multimedia applications
ACM Transactions on Architecture and Code Optimization (TACO)
Address-Indexed Memory Disambiguation and Store-to-Load Forwarding
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Beating In-Order Stalls with "Flea-Flicker" Two-Pass Pipelining
IEEE Transactions on Computers
SoftSig: software-exposed hardware signatures for code analysis and optimization
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Compiler and hardware support for reducing the synchronization of speculative threads
ACM Transactions on Architecture and Code Optimization (TACO)
Proceedings of the 36th annual international symposium on Computer architecture
A case for an SC-preserving compiler
Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation
OUTRIDER: efficient memory latency tolerance with decoupled strands
Proceedings of the 38th annual international symposium on Computer architecture
Trimaran: an infrastructure for research in instruction-level parallelism
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Static analysis and compiler design for idempotent processing
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
SMARQ: Software-Managed Alias Register Queue for Dynamic Optimizations
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Runtime dependency analysis for loop pipelining in high-level synthesis
Proceedings of the 50th Annual Design Automation Conference
Hi-index | 0.01 |
To exploit instruction level parallelism, compilers for VLIW and superscalar processors often employ static code scheduling. However, the available code reordering may be severely restricted due to ambiguous dependences between memory instructions. This paper introduces a simple hardware mechanism, referred to as the memory conflict buffer, which facilitates static code scheduling in the presence of memory store/load dependences. Correct program execution is ensured by the memory conflict buffer and repair code provided by the compiler. With this addition, significant speedup over an aggressive code scheduling model can be achieved for both non-numerical and numerical programs.