Designers of embedded systems are free to choose the most suitable L1 cache configuration in modern processor-based SoCs. Choosing an appropriate configuration requires simulating long memory access traces to obtain accurate hit/miss rates. The long execution time needed to simulate these traces, particularly when each configuration requires a separate simulation, is a major drawback. Researchers have proposed techniques to speed up the simulation of caches with the LRU replacement policy. These techniques are of little use for the majority of embedded processors, which use caches with the round-robin replacement policy. In this paper we propose a fast L1 cache simulation approach, called SCUD (Sorted Collection of Unique Data), for caches with the round-robin policy. SCUD is a single-pass cache simulator that can simulate multiple L1 cache configurations (with varying set sizes and associativities) by reading the application trace once. Using fast binary searches in a novel data structure, SCUD simulates an application trace significantly faster than Dinero IV, a widely used single-configuration cache simulator. We show that SCUD can simulate a set of cache configurations up to 57 times faster than Dinero IV. SCUD achieves an average speedup of 19.34 times over Dinero IV for Mediabench applications and an average speedup of over 10 times for SPEC CPU2000 applications.
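
To make the simulated policy concrete, the following is a minimal Python sketch of a set-associative cache with round-robin replacement, run for several configurations over one trace. It is not SCUD: the abstract does not describe SCUD's data structure, so the sketch shows only the naive baseline in which every configuration keeps its own state and is updated on every access. The names `RoundRobinCache` and `simulate`, the 32-byte block size, and the example trace are all assumptions made for illustration.

```python
class RoundRobinCache:
    """Set-associative cache with round-robin replacement (illustrative only).

    All names and defaults here are hypothetical; they are not taken from
    SCUD or Dinero IV.
    """

    def __init__(self, num_sets, associativity, block_size=32):
        self.num_sets = num_sets
        self.associativity = associativity
        self.block_size = block_size
        self.hits = 0
        self.misses = 0
        # One list of tags per set; None marks an empty way.
        self.tags = [[None] * associativity for _ in range(num_sets)]
        # Per-set pointer to the way that will be replaced on the next miss.
        self.next_victim = [0] * num_sets

    def access(self, address):
        """Simulate one memory access; return True on a hit."""
        block = address // self.block_size
        index = block % self.num_sets
        tag = block // self.num_sets
        ways = self.tags[index]
        if tag in ways:
            # Unlike LRU, a hit leaves the replacement state untouched.
            self.hits += 1
            return True
        self.misses += 1
        way = self.next_victim[index]
        ways[way] = tag
        self.next_victim[index] = (way + 1) % self.associativity
        return False


def simulate(trace, configs):
    """Read the trace once, updating every configuration on each access.

    configs is an iterable of (num_sets, associativity) pairs. The result
    maps each pair to its (hits, misses) counts.
    """
    caches = [RoundRobinCache(sets, ways) for sets, ways in configs]
    for address in trace:
        for cache in caches:
            cache.access(address)
    return {(c.num_sets, c.associativity): (c.hits, c.misses) for c in caches}


if __name__ == "__main__":
    # Blocks A, B, A, C, A with 32-byte blocks (hypothetical trace).
    trace = [0, 32, 0, 64, 0]
    for config, (hits, misses) in simulate(trace, [(1, 2), (1, 4)]).items():
        print(config, "hits:", hits, "misses:", misses)
```

In the one-set, two-way configuration, the access to block C evicts block A even though A was just reused, because round-robin replacement ignores hits. An LRU cache would evict B instead. This difference is why simulation speedups designed for LRU do not carry over directly. The work in this sketch grows linearly with the number of configurations, which is the cost a single-pass simulator such as SCUD aims to reduce.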