Internal organization of the Alpha 21164, a 300-MHz 64-bit quad-issue CMOS RISC microprocessor
Digital Technical Journal - Special 10th anniversary issue
The filter cache: an energy efficient memory structure
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Adapting cache line size to application behavior
ICS '99 Proceedings of the 13th international conference on Supercomputing
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Way-predicting set-associative cache for high performance and low energy consumption
ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Selective cache ways: on-demand cache resource allocation
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Smart Memories: a modular reconfigurable architecture
Proceedings of the 27th annual international symposium on Computer architecture
Reconfigurable caches and their application to media processing
Proceedings of the 27th annual international symposium on Computer architecture
Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories
ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
A low power unified cache architecture providing power and performance flexibility (poster session)
ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Cache decay: exploiting generational behavior to reduce cache leakage power
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
A Reconfigurable multifunction computing cache architecture
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
DRG-cache: a data retention gated-ground cache for low power
Proceedings of the 39th annual Design Automation Conference
Reducing set-associative cache energy via way-prediction and selective direct-mapping
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
SH3: High Code Density, Low Power
IEEE Micro
Adaptive Mode Control: A Static-Power-Efficient Cache Design
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Integrating Adaptive On-Chip Storage Structures for Reduced Dynamic Power
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Predictive sequential associative cache
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Energy Benefits of a Configurable Line Size Cache for Embedded Systems
ISVLSI '03 Proceedings of the IEEE Computer Society Annual Symposium on VLSI (ISVLSI'03)
A highly configurable cache architecture for embedded systems
Proceedings of the 30th annual international symposium on Computer architecture
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
A self-tuning cache architecture for embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
A way-halting cache for low-energy high-performance systems
ACM Transactions on Architecture and Code Optimization (TACO)
Configurable cache subsetting for fast cache tuning
Proceedings of the 43rd annual Design Automation Conference
The effect of temperature on cache size tuning for low energy embedded systems
Proceedings of the 17th ACM Great Lakes symposium on VLSI
A cache design for high performance embedded systems
Journal of Embedded Computing - Cache exploitation in embedded systems
Automatic cache tuning for energy-efficiency using local regression modeling
Proceedings of the 44th annual Design Automation Conference
Improving SDRAM access energy efficiency for low-power embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
A table-based method for single-pass cache optimization
Proceedings of the 18th ACM Great Lakes symposium on VLSI
Proceedings of the 13th international symposium on Low power electronics and design
Instruction Cache Tuning for Embedded Multitasking Applications
RSP '09 Proceedings of the 2009 IEEE/IFIP International Symposium on Rapid System Prototyping
Reducing energy in instruction caches by using multiple line buffers with prediction
ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
Proceedings of the 48th Design Automation Conference
FFT-cache: a flexible fault-tolerant cache architecture for ultra low voltage operation
CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
Dynamic Cache Reconfiguration for Soft Real-Time Systems
ACM Transactions on Embedded Computing Systems (TECS)
Energy- and performance-aware scheduling of tasks on parallel and distributed systems
ACM Journal on Emerging Technologies in Computing Systems (JETC)
Shrinking l1 instruction caches to improve energy: delay in SMT embedded processors
ARCS'13 Proceedings of the 26th international conference on Architecture of Computing Systems
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Thread-criticality aware dynamic cache reconfiguration in multi-core system
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.00 |
Energy consumption is a major concern in many embedded computing systems. Several studies have shown that cache memories account for about 50% of the total energy consumed in these systems. The performance of a given cache architecture is determined, to a large degree, by the behavior of the application executing on the architecture. Desktop systems have to accommodate a very wide range of applications and therefore the cache architecture is usually set by the manufacturer as a best compromise given current applications, technology, and cost. Unlike desktop systems, embedded systems are designed to run a small range of well-defined applications. In this context, a cache architecture that is tuned for that narrow range of applications can have both increased performance as well as lower energy consumption. We introduce a novel cache architecture intended for embedded microprocessor platforms. The cache has three software-configurable parameters that can be tuned to particular applications. First, the cache's associativity can be configured to be direct-mapped, two-way, or four-way set-associative, using a novel technique we call way concatenation. Second, the cache's total size can be configured by shutting down ways. Finally, the cache's line size can be configured to have 16, 32, or 64 bytes. A study of 23 programs drawn from Powerstone, MediaBench, and Spec2000 benchmark suites shows that the configurable cache tuned to each program saved energy for every program compared to a conventional four-way set-associative cache as well as compared to a conventional direct-mapped cache, with an average savings of energy related to memory access of over 40%.