Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Selective cache ways: on-demand cache resource allocation
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
A low power unified cache architecture providing power and performance flexibility (poster session)
ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Multi-objective design space exploration using genetic algorithms
Proceedings of the tenth international symposium on Hardware/software codesign
Profiling tools for hardware/software partitioning of embedded applications
Proceedings of the 2003 ACM SIGPLAN conference on Language, compiler, and tool for embedded systems
A highly configurable cache architecture for embedded systems
Proceedings of the 30th annual international symposium on Computer architecture
Proceedings of the 30th annual international symposium on Computer architecture
Automatic Tuning of Two-Level Caches to Embedded Applications
Proceedings of the conference on Design, automation and test in Europe - Volume 1
High level cache simulation for heterogeneous multiprocessors
Proceedings of the 41st annual Design Automation Conference
Cache Optimization For Embedded Processor Cores: An Analytical Approach
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
EaseCAM: An Energy and Storage Efficient TCAM-Based Router Architecture for IP Lookup
IEEE Transactions on Computers
Fast configurable-cache tuning with a unified second-level cache
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
StatCache: a probabilistic approach to efficient and accurate data locality analysis
ISPASS '04 Proceedings of the 2004 IEEE International Symposium on Performance Analysis of Systems and Software
Platune: a tuning framework for system-on-a-chip platforms
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
A table-based method for single-pass cache optimization
Proceedings of the 18th ACM Great Lakes symposium on VLSI
Variable-sized object packing and its applications to instruction cache design
Computers and Electrical Engineering
On the interplay of loop caching, code compression, and cache configuration
Proceedings of the 16th Asia and South Pacific Design Automation Conference
T-SPaCS: a two-level single-pass cache simulation methodology
Proceedings of the 16th Asia and South Pacific Design Automation Conference
Dynamic Cache Reconfiguration for Soft Real-Time Systems
ACM Transactions on Embedded Computing Systems (TECS)
Survey of scheduling techniques for addressing shared resources in multicore processors
ACM Computing Surveys (CSUR)
Adaptive loop caching using lightweight runtime control flow analysis
ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
A survey and taxonomy of on-chip monitoring of multicore systems-on-chip
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Reuse-based online models for caches
Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems
A survey on cache tuning from a power/energy perspective
ACM Computing Surveys (CSUR)
Two-level caches tuning technique for energy consumption in reconfigurable embedded MPSoC
Journal of Systems Architecture: the EUROMICRO Journal
An analytical approach for fast and accurate design space exploration of instruction caches
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
We introduce a new non-intrusive on-chip cache-tuning hardware module capable of accurately predicting the best configuration of a configurable cache for an executing application. Previous dynamic cache tuning approaches change the cache configuration several times as part of the tuning search process, executing the application using inferior configurations and temporarily causing energy and performance overhead. The introduced tuner uses a different approach, which non-intrusively collects data on addresses issued by the microprocessor, analyzes that data to predict the best cache configuration, and then updates the cache to the new best configuration in "one-shot," without ever having to examine inferior configurations. The result is less energy and less performance overhead, meaning that cache tuning can be applied more frequently. We show through experiments that the one-shot cache tuner can reduce memory-access related energy for instructions by 35% and comes within 4% of a previous intrusive approach, and results in 4.6 times less energy overhead and a 7.7 times speedup in tuning time compared to a previous intrusive approach, at the main expense of 12% larger size.