Efficient (stack) algorithms for analysis of write-back and sector memories
ACM Transactions on Computer Systems (TOCS)
Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
Introduction to algorithms
Efficient trace-driven simulation methods for cache performance analysis
ACM Transactions on Computer Systems (TOCS)
Set-associative cache simulation using generalized binomial trees
ACM Transactions on Computer Systems (TOCS)
The art of computer programming, volume 3: (2nd ed.) sorting and searching
The art of computer programming, volume 3: (2nd ed.) sorting and searching
Predicting whole-program locality through reuse distance analysis
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Estimating cache misses and locality using stack distances
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Array regrouping and structure splitting using whole-program reference affinity
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Multiple Page Size Modeling and Optimization
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Quantifying Locality In The Memory Access Patterns of HPC Applications
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
Feedback-directed memory disambiguation through store distance analysis
Proceedings of the 20th annual international conference on Supercomputing
Locality approximation using time
Proceedings of the 34th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Miss Rate Prediction Across Program Inputs and Cache Configurations
IEEE Transactions on Computers
CRAMM: virtual memory support for garbage-collected applications
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Accurate memory signatures and synthetic address traces for HPC applications
Proceedings of the 22nd annual international conference on Supercomputing
Sampling-based program locality approximation
Proceedings of the 7th international symposium on Memory management
Program locality analysis using reuse distance
ACM Transactions on Programming Languages and Systems (TOPLAS)
Performance of large low-associativity caches
ACM SIGMETRICS Performance Evaluation Review
Accelerating multicore reuse distance analysis with sampling and parallelization
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Studying inter-core data reuse in multicores
Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Studying inter-core data reuse in multicores
ACM SIGMETRICS Performance Evaluation Review - Performance evaluation review
Efficient stack distance computation for priority replacement policies
Proceedings of the 8th ACM International Conference on Computing Frontiers
Path-Based reuse distance analysis
CC'06 Proceedings of the 15th international conference on Compiler Construction
Reuse distance based performance modeling and workload mapping
Proceedings of the 9th conference on Computing Frontiers
DaaC: device-reserved memory as an eviction-based file cache
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
Reuse-based online models for caches
Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems
Optimal eviction policies for stochastic address traces
Theoretical Computer Science
Hi-index | 0.00 |
This paper1 describes our experience using the stack processing algorithm [6] for estimating the number of cache misses in scientific programs. By using a new data structure and various optimization techniques we obtain instrumented run-times within 50 to 100 times the original optimized run-times of our benchmarks.