PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Efficient simulation of caches under optimal replacement with applications to miss characterization
SIGMETRICS '93 Proceedings of the 1993 ACM SIGMETRICS conference on Measurement and modeling of computer systems
ATOM: a system for building customized program analysis tools
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
Dynamic speculation and synchronization of data dependences
Proceedings of the 24th annual international symposium on Computer architecture
Memory dependence prediction using store sets
Proceedings of the 25th annual international symposium on Computer architecture
Speculation techniques for improving load related instruction scheduling
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Classifying load and store instructions for memory renaming
ICS '99 Proceedings of the 13th international conference on Supercomputing
Cost Effective Memory Dependence Prediction using Speculation Levels and Color Sets
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Reuse Distance-Based Cache Hint Selection
Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Calculating stack distances efficiently
Proceedings of the 2002 workshop on Memory system performance
Predicting whole-program locality through reuse distance analysis
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Estimating cache misses and locality using stack distances
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Automatic Generation of Microarchitecture Simulators
ICCL '98 Proceedings of the 1998 International Conference on Computer Languages
Memory dependence prediction
Improving effective bandwidth through compiler enhancement of global and dynamic cache reuse
Improving effective bandwidth through compiler enhancement of global and dynamic cache reuse
Miss Rate Prediction across All Program Inputs
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Array regrouping and structure splitting using whole-program reference affinity
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Cross-architecture performance predictions for scientific applications using parameterized models
Proceedings of the joint international conference on Measurement and modeling of computer systems
Cooperative hardware/software caching for next-generation memory systems
Cooperative hardware/software caching for next-generation memory systems
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Reuse-distance-based miss-rate prediction on a per instruction basis
MSP '04 Proceedings of the 2004 workshop on Memory system performance
Instruction Based Memory Distance Analysis and its Application
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
DMDC: Delayed Memory Dependence Checking through Age-Based Filtering
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Is reuse distance applicable to data locality analysis on chip multiprocessors?
CC'10/ETAPS'10 Proceedings of the 19th joint European conference on Theory and Practice of Software, international conference on Compiler Construction
Automated locality optimization based on the reuse distance of string operations
CGO '11 Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization
Hi-index | 0.00 |
Feedback-directed optimization has developed into an increasingly important tool in designing optimizing compilers. Based upon profiling, memory distance analysis has shown much promise in predicting data locality and memory dependences, and has seen use in locality based optimizations and memory disambiguation. In this paper, we apply a form of memory distance, called store distance, to the problem of memory disambiguation in out-of-order issue processors. Store distance is defined as the number of store references between a load and the previous store accessing the same memory location. By generating a representative store distance for each load instruction, we can apply a compiler/micro-architecture cooperative scheme to direct run-time load speculation. Using store distance, the processor can, in most cases, accurately determine on which specific store instruction a load depends according to its store distance annotation. Our experiments show that the proposed store distance method performs much better than the previous distance based memory disambiguation scheme, and yields a performance very close to perfect memory disambiguation. The store distance based scheme also outperforms the store set technique with a relatively small predictor space and achieves performance comparable to that of a 16K-entry store set implementation for both floating point and integer programs.