Improving power efficiency of D-NUCA caches

Authors:
A. Bardine;P. Foglia;G. Gabrielli;C. A. Prete;P. Stenström
Affiliations:
Università di Pisa, Via Diotisalvi, Pisa (Italy);Università di Pisa, Via Diotisalvi, Pisa (Italy);Università di Pisa, Via Diotisalvi, Pisa (Italy);Università di Pisa, Via Diotisalvi, Pisa (Italy);Chalmers University of Technology, Gothenburg (Sweden)
Venue:
ACM SIGARCH Computer Architecture News
Year:
2007

Citing 17
Cited 1

Selective cache ways: on-demand cache resource allocation

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Clock rate versus IPC: the end of the road for conventional microarchitectures

Proceedings of the 27th annual international symposium on Computer architecture
Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories

ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Memory hierarchy reconfiguration for energy and performance in general-purpose processor architectures

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Let caches decay: reducing leakage energy via exploitation of cache generational behavior

ACM Transactions on Computer Systems (TOCS)
Drowsy caches: simple techniques for reducing leakage power

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches

Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Integrating Adaptive On-Chip Storage Structures for Reduced Dynamic Power

Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques
Static energy reduction techniques for microprocessor caches

IEEE Transactions on Very Large Scale Integration (VLSI) Systems - Special section on the 2001 international conference on computer design (ICCD)
Comparing Program Phase Detection Techniques

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Distance Associativity for High-Performance Energy-Efficient Non-Uniform Cache Architectures

Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Controlling leakage power with the replacement policy in slumberous caches

Proceedings of the 2nd conference on Computing frontiers
Exploring the limits of leakage power reduction in caches

ACM Transactions on Architecture and Code Optimization (TACO)
Locality analysis to control dynamically way-adaptable caches

MEDEA '04 Proceedings of the 2004 workshop on MEmory performance: DEaling with Applications , systems and architecture
Power reduction techniques for microprocessor systems

ACM Computing Surveys (CSUR)
A cache design for high performance embedded systems

Journal of Embedded Computing - Cache exploitation in embedded systems
Nonuniform Cache Architectures for Wire-Delay Dominated On-Chip Caches

IEEE Micro

Modeling of cache access behavior based on Zipf's law

Proceedings of the 9th workshop on MEmory performance: DEaling with Applications, systems and architecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

D-NUCA caches are cache memories that, thanks to banked organization, broadcast search and promotion/demotion mechanism, are able to tolerate the increasing wire delay effects introduced by technology scaling. As a consequence, they will outperform conventional caches (UCA, Uniform Cache Architectures) in future generation cores. Due to the promotion/demotion mechanism, we have found that, in a D-NUCA cache, the distribution of hits on the ways varies across applications as well as across different execution phases within a single application. In this paper, we show how such a behavior can be utilized to improve D-NUCA power efficiency as well as to decrease its access latencies. In particular, we propose a new D-NUCA structure, called Way Adaptable D-NUCA cache, in which the number of active (i.e. powered-on) ways is dynamically adapted to the need of the running application. Our initial evaluation shows that a consistent reduction of both the average number of active ways (42% in average) and the number of bank access requests (29% in average) is achieved, without significantly affecting the IPC.