Accurate Low-Cost Methods for Performance Evaluation of Cache Memory Systems
IEEE Transactions on Computers
Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
Efficient trace-driven simulation method for cache performance analysis
SIGMETRICS '90 Proceedings of the 1990 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Efficiently counting program events with support for on-line queries
ACM Transactions on Programming Languages and Systems (TOPLAS)
Set-associative cache simulation using generalized binomial trees
ACM Transactions on Computer Systems (TOCS)
A probabilistic method for calculating hit ratios in direct mapped caches
Journal of Network and Computer Applications
Iterative cache simulation of embedded CPUs with trace stripping
CODES '99 Proceedings of the seventh international workshop on Hardware/software codesign
Hardware-software co-design of embedded reconfigurable architectures
Proceedings of the 37th Annual Design Automation Conference
Analytical Design Space Exploration of Caches for Embedded Systems
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
StatCache: a probabilistic approach to efficient and accurate data locality analysis
ISPASS '04 Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software
Instruction trace compression for rapid instruction cache simulation
Proceedings of the conference on Design, automation and test in Europe
Cache modeling in probabilistic execution time analysis
Proceedings of the 45th annual Design Automation Conference
Improved procedure placement for set associative caches
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Survey of scheduling techniques for addressing shared resources in multicore processors
ACM Computing Surveys (CSUR)
Precise timing analysis for direct-mapped caches
Proceedings of the 50th Annual Design Automation Conference
An analytical approach for fast and accurate design space exploration of instruction caches
ACM Transactions on Embedded Computing Systems (TECS)
An efficient compiler framework for cache bypassing on GPUs
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.00 |
Application-specific system-on-chip platforms create the opportunity to customize the cache configuration for optimal performance with minimal chip estate. Simulation, in particular trace-driven simulation, is widely used to estimate cache hit rates. However, simulation is too slow to be deployed in the design space exploration, specially when it involves hundreds of design points and huge traces or long program execution. In this paper, we propose a novel static analysis technique for rapid and accurate design space exploration of instruction caches. Given the program control flow graph (CFG) annotated only with basic block and control flow edge execution counts, our analysis estimates the hit rates for multiple cache configurations in one pass. We achieve this by modeling the cache states at each node of the CFG in probabilistic manner and exploiting the structural similarities among related cache configurations. Experimental results indicate that our analysis is 24--3,855 times faster compared to the fastest known cache simulator while maintaining high accuracy (0.7% average error), in predicting hit rates for popular embedded benchmarks.