Power has been a major concern in processor design for several years. As caches account for a growing share of CPU die area and power, this paper proposes filtering unnecessary way accesses to reduce the dynamic power consumption of a unified L2 cache shared by instructions and data. Our methods are the Invalid Filter, which eliminates accesses to cache ways that contain invalid blocks; the I/D Filter, which eliminates accesses to cache ways whose blocks do not match the instruction/data type of the access; and the Tag-2 Filter, which eliminates accesses to cache ways whose blocks mismatch in the lowest 2 bits of the tag. Because these methods reduce the activity inside the cache, dynamic CPU power can be decreased significantly. We also propose combining the three methods (Invalid+I/D+Tag-2 Filter) into what we call the Way-Filtering Cache, in an attempt to achieve better power savings. Our evaluations show that the Way-Filtering Cache obtains a 19.6%-47.8% (34.3% on average) improvement on a 64K-4way cache and a 19.6%-55.2% (39.2% on average) improvement on a 128K-8way cache compared to the Invalid+I/D Filter, and a 6.1%-27.7% (16.6% on average) improvement on a 64K-4way cache and a 6.9%-44.4% (25.0% on average) improvement on a 128K-8way cache compared to the Invalid+Tag-2 Filter. Compared to Tag-Data caches, which are popular for less latency-sensitive caches (e.g., a unified L2 or shared last-level cache), the Way-Filtering Cache achieves an 18.3%-29.2% (23.1% on average) improvement on a 64K-4way cache and a 27.2%-50.1% (41.1% on average) improvement on a 128K-8way cache.
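The combined filter decision can be illustrated with a small software model. The sketch below is not the paper's hardware design, which the abstract does not detail. It assumes each way in a set stores a valid bit, an instruction/data type bit, and a tag, and that a way is probed only if it passes all three filters. The names `Way`, `ways_to_probe`, and the example tag values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Way:
    """One way of a cache set (hypothetical model, not the paper's layout)."""
    valid: bool      # False if the way holds an invalid block
    is_instr: bool   # True if the block holds instructions, False for data
    tag: int         # full tag of the stored block


def ways_to_probe(ways, req_tag, req_is_instr):
    """Return the indices of ways that survive all three filters."""
    survivors = []
    for index, way in enumerate(ways):
        if not way.valid:
            continue  # Invalid Filter: skip ways holding invalid blocks
        if way.is_instr != req_is_instr:
            continue  # I/D Filter: skip ways with a mismatched access type
        if (way.tag & 0b11) != (req_tag & 0b11):
            continue  # Tag-2 Filter: skip ways whose lowest 2 tag bits differ
        survivors.append(index)
    return survivors


# Example 4-way set: only way 3 can possibly hit a data access with tag 0x15.
cache_set = [
    Way(valid=False, is_instr=False, tag=0x00),  # removed by the Invalid Filter
    Way(valid=True, is_instr=True, tag=0x15),    # removed by the I/D Filter
    Way(valid=True, is_instr=False, tag=0x16),   # removed by the Tag-2 Filter
    Way(valid=True, is_instr=False, tag=0x15),   # survives all three filters
]

probed = ways_to_probe(cache_set, req_tag=0x15, req_is_instr=False)
print(probed)  # prints [3]
```

In this model, a conventional 4-way lookup would activate all four ways, while the filtered lookup activates one. A surviving way still needs a full tag comparison, since matching the lowest 2 tag bits does not guarantee a hit.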