Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
Set-associative cache simulation using generalized binomial trees
ACM Transactions on Computer Systems (TOCS)
Empirical study of parallel trace-driven LRU cache simulators
PADS '95 Proceedings of the ninth workshop on Parallel and distributed simulation
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Cache miss equations: a compiler framework for analyzing and tuning memory behavior
ACM Transactions on Programming Languages and Systems (TOPLAS)
Parallel trace-driven cache simulation by time partitioning
WSC' 90 Proceedings of the 22nd conference on Winter simulation
Wattch: a framework for architectural-level power analysis and optimizations
Proceedings of the 27th annual international symposium on Computer architecture
A design framework to efficiently explore energy-delay tradeoffs
Proceedings of the ninth international symposium on Hardware/software codesign
AccuPower: An Accurate Power Estimation Tool for Superscalar Microprocessors
Proceedings of the conference on Design, automation and test in Europe
A fast and accurate framework to analyze and optimize cache memory behavior
ACM Transactions on Programming Languages and Systems (TOPLAS)
High level cache simulation for heterogeneous multiprocessors
Proceedings of the 41st annual Design Automation Conference
Design space exploration of caches using compressed traces
Proceedings of the 18th annual international conference on Supercomputing
Analytical Design Space Exploration of Caches for Embedded Systems
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Instruction trace compression for rapid instruction cache simulation
Proceedings of the conference on Design, automation and test in Europe
A table-based method for single-pass cache optimization
Proceedings of the 18th ACM Great Lakes symposium on VLSI
HitME: low power Hit MEmory buffer for embedded systems
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
Exact and fast L1 cache simulation for embedded systems
Proceedings of the 2009 Asia and South Pacific Design Automation Conference
SuSeSim: a fast simulation strategy to find optimal L1 cache configuration for embedded systems
CODES+ISSS '09 Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Cache line reservation: exploring a scheme for cache-friendly object allocation
CASCON '09 Proceedings of the 2009 Conference of the Center for Advanced Studies on Collaborative Research
Proceedings of the 47th Design Automation Conference
DEW: a fast level 1 cache simulation approach for embedded processors with FIFO replacement policy
Proceedings of the Conference on Design, Automation and Test in Europe
T-SPaCS: a two-level single-pass cache simulation methodology
Proceedings of the 16th Asia and South Pacific Design Automation Conference
HC-Sim: a fast and exact l1 cache simulator with scratchpad memory co-simulation support
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
CIPARSim: cache intersection property assisted rapid single-pass FIFO cache simulation technique
Proceedings of the International Conference on Computer-Aided Design
DIMSim: a rapid two-level cache simulation approach for deadline-based MPSoCs
Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
A survey on cache tuning from a power/energy perspective
ACM Computing Surveys (CSUR)
Hi-index | 0.00 |
Modern embedded system execute a single application or a class of applications repeatedly. A new emerging methodology of designing embedded system utilizes configurable processors where the cache size, associativity, and line size can be chosen by the designer. In this paper, a method is given to rapidly find the L1 cache miss rate of an application. An energy model and an execution time model are developed to find the best cache configuration for the given embedded application. Using benchmarks from Mediabench, we find that our method is on average 45 times faster to explore the design space, compared to Dinero IV while still having 100% accuracy.