Computer architecture: a quantitative approach
Computer architecture: a quantitative approach
A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
MIPS RISC architectures
Alpha architecture reference manual
Alpha architecture reference manual
Automatic and interactive parallelization
Automatic and interactive parallelization
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Supercomputer performance evaluation and the Perfect Benchmarks
ICS '90 Proceedings of the 4th international conference on Supercomputing
ACM Computing Surveys (CSUR)
Concrete Math
On Estimating and Enhancing Cache Effectiveness
Proceedings of the Fourth International Workshop on Languages and Compilers for Parallel Computing
Aspects of cache memory and instruction buffer performance
Aspects of cache memory and instruction buffer performance
Influence of cross-interferences on blocked loops: a case study with matrix-vector multiply
ACM Transactions on Programming Languages and Systems (TOPLAS)
The influence of caches on the performance of heaps
Journal of Experimental Algorithmics (JEA)
A quantitative analysis of loop nest locality
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Improving single-process performance with multithreaded processors
ICS '96 Proceedings of the 10th international conference on Supercomputing
Cache miss equations: an analytical representation of cache misses
ICS '97 Proceedings of the 11th international conference on Supercomputing
Data transformations for eliminating conflict misses
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Modeling set associative caches behavior for irregular computations
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Eliminating conflict misses for high performance architectures
ICS '98 Proceedings of the 12th international conference on Supercomputing
Precise miss analysis for program transformations with caches of arbitrary associativity
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Cache performance analysis of traversals and random accesses
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Analytical Modeling of Set-Associative Cache Behavior
IEEE Transactions on Computers
Cache miss equations: a compiler framework for analyzing and tuning memory behavior
ACM Transactions on Programming Languages and Systems (TOPLAS)
Locality optimizations for multi-level caches
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Automated cache optimizations using CME driven diagnosis
Proceedings of the 14th international conference on Supercomputing
Symbolic Cache Analysis for Real-Time Systems
Real-Time Systems - Special issue on worst-case execution-time analysis
Tiling optimizations for 3D scientific computations
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Morphable Cache Architectures: Potential Benefits
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
Compiler-directed cache polymorphism
Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Compilation of Vector Statements of C[] Language for Architectures with Multilevel Memory Hierarchy
Programming and Computing Software
Computer
Probabilistic Miss Equations: Evaluating Memory Hierarchy Performance
IEEE Transactions on Computers
A Compiler Framework for Tiling Imperfectly-Nested Loops
LCPC '99 Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing
A Blocked All-Pairs Shortest-Path Algorithm
SWAT '00 Proceedings of the 7th Scandinavian Workshop on Algorithm Theory
A Performance Estimator for Parallel Programs
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Set Associative Cache Behavior Optimization
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Improving Cache Effectiveness through Array Data Layout Manipulation in SAC
IFL '00 Selected Papers from the 12th International Workshop on Implementation of Functional Languages
Predicting the impact of optimizations for embedded systems
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
Highly accurate and efficient evaluation of randomising set index functions
Journal of Systems Architecture: the EUROMICRO Journal
A Quantitative Analysis of Tile Size Selection Algorithms
The Journal of Supercomputing
A fast and accurate framework to analyze and optimize cache memory behavior
ACM Transactions on Programming Languages and Systems (TOPLAS)
Efficient and Accurate Analytical Modeling of Whole-Program Data Cache Behavior
IEEE Transactions on Computers
A compiler tool to predict memory hierarchy performance of scientific codes
Parallel Computing
A blocked all-pairs shortest-paths algorithm
Journal of Experimental Algorithmics (JEA)
Quasidynamic Layout Optimizations for Improving Data Locality
IEEE Transactions on Parallel and Distributed Systems
Automatic tiling of iterative stencil loops
ACM Transactions on Programming Languages and Systems (TOPLAS)
A Geometric Programming Framework for Optimal Multi-Level Tiling
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
Analyzing data reuse for cache reconfiguration
ACM Transactions on Embedded Computing Systems (TECS)
Analytical modeling of codes with arbitrary data-dependent conditional structures
Journal of Systems Architecture: the EUROMICRO Journal
Profitable loop fusion and tiling using model-driven empirical search
Proceedings of the 20th annual international conference on Supercomputing
Precise automatable analytical modeling of the cache behavior of codes with indirections
ACM Transactions on Architecture and Code Optimization (TACO)
Cache-aware partitioning of multi-dimensional iteration spaces
SYSTOR '09 Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference
Cache line reservation: exploring a scheme for cache-friendly object allocation
CASCON '09 Proceedings of the 2009 Conference of the Center for Advanced Studies on Collaborative Research
Algorithms for memory hierarchies: advanced lectures
Algorithms for memory hierarchies: advanced lectures
Tuning blocked array layouts to exploit memory hierarchy in SMT architectures
PCI'05 Proceedings of the 10th Panhellenic conference on Advances in Informatics
Hi-index | 0.01 |
The impact of cache interferences on program performance (particularly numerical codes, which heavily use the memory hierarchy) remains unknown. The general knowledge is that cache interferences are highly irregular, in terms of occurrence and intensity. In this paper, the different types of cache interferences that can occur in numerical loop nests are identified. An analytical method is developed for detecting the occurrence of interferences and, more important, for computing the number of cache misses due to interferences. Simulations and experiments on real machines show that the model is generally accurate and that most interference phenomena are captured. Experiments also show that cache interferences can be intense and frequent. Certain parameters such as array base addresses or dimensions can have a strong impact on the occurrence of interferences. Modifying these parameters only can induce global execution time variations of 30% and more. Applications of these modeling techniques are numerous and range from performance evaluation and prediction to enhancement of data locality optimizations techniques.