Probabilistic Miss Equations: Evaluating Memory Hierarchy Performance
IEEE Transactions on Computers
SIGMETRICS '03 Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Data cache locking for higher program predictability
SIGMETRICS '03 Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Estimating cache misses and locality using stack distances
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Data Caches in Multitasking Hard Real-Time Systems
RTSS '03 Proceedings of the 24th IEEE International Real-Time Systems Symposium
A fast and accurate framework to analyze and optimize cache memory behavior
ACM Transactions on Programming Languages and Systems (TOPLAS)
Efficient and Accurate Analytical Modeling of Whole-Program Data Cache Behavior
IEEE Transactions on Computers
A compiler tool to predict memory hierarchy performance of scientific codes
Parallel Computing
Compiler Estimation of Load Imbalance Overhead in Speculative Parallelization
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Line Size Adaptivity Analysis of Parameterized Loop Nests for Direct Mapped Data Cache
IEEE Transactions on Computers
Predicting Cache Space Contention in Utility Computing Servers
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 10 - Volume 11
Fast data-locality profiling of native execution
SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Generating cache hints for improved program efficiency
Journal of Systems Architecture: the EUROMICRO Journal
Relative factors in performance analysis of Java virtual machines
Proceedings of the 2nd international conference on Virtual execution environments
Analytical modeling of codes with arbitrary data-dependent conditional structures
Journal of Systems Architecture: the EUROMICRO Journal
Complete inlining of recursive calls: beyond tail-recursion elimination
Proceedings of the 44th annual Southeast regional conference
CMP cache performance projection: accessibility vs. capacity
ACM SIGARCH Computer Architecture News
A compiler cost model for speculative parallelization
ACM Transactions on Architecture and Code Optimization (TACO)
Precise automatable analytical modeling of the cache behavior of codes with indirections
ACM Transactions on Architecture and Code Optimization (TACO)
Data cache locking for tight timing calculations
ACM Transactions on Embedded Computing Systems (TECS)
Tightening the bounds on feasible preemptions
ACM Transactions on Embedded Computing Systems (TECS)
Near-optimal padding for removing conflict misses
LCPC'02 Proceedings of the 15th international conference on Languages and Compilers for Parallel Computing
Integrating software caches with scratch pad memory
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
Hi-index | 0.01 |
Based on a new characterisation of data reuse across multiple loop nests, we present a method, a prototyping implementation and some experimental results for analysing the cache behaviour of whole programs with regular computations. Validation against cache simulation using real codes shows the efficiency and accuracy of our method. The largest program we have analysed, Applu from SPECfp95, has 3868 lines, 16 subroutines and 2565 references. In the case of a 32KB cache with a 32B line size, our method obtains the miss ratio with an absolute error of about 0.80\% in about 128 seconds while the simulator used runs for nearly 5 hours on a 933MHz Pentium III PC. Our method can be used to guide compiler locality optimisations and improve cache simulation performance.