A compiler approach for reducing data cache energy

Authors:
W. Zhang; M. Karakoy; M. Kandemir; G. Chen
Affiliations:
Penn State University, University Park, PA; Imperial College, London, UK; Penn State University, University Park, PA; Penn State University, University Park, PA
Venue:
ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Year:
2003

Abstract

Advances in silicon technology have made it possible to pack millions of transistors, switching at high clock speeds, onto a single chip. While these advances bring unprecedented performance to electronic products, they pose difficult power and energy consumption problems. For example, the large number of transistors in dense on-chip cache memories consumes significant static (leakage) power even when the cache is not used by the current computation. While previous compiler research studied code and data restructuring for improving data cache performance, to our knowledge there is no compiler-based study that targets data cache leakage power consumption. In this paper, we present code restructuring techniques for array-based and pointer-intensive applications that reduce data cache energy consumption. The idea is to let the compiler analyze the code and insert instructions that turn off cache lines holding variables not used by the current computation. Turning off a cache line does not destroy its contents, and waking it up incurs very little overhead. Because of data locality, we find that at any given time only a small portion of the data cache needs to be active; the remaining part can be placed into a leakage-saving mode (state), i.e., it can be turned off. Our preliminary results indicate that the proposed strategy reduces cache energy consumption significantly. We also show that several compiler optimizations increase the effectiveness of our strategy.
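
The idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm or its evaluation setup: the constants `ACTIVE_LEAK`, `DROWSY_LEAK`, and `WAKE_PENALTY`, the direct-mapped line mapping, and the function names are all assumptions made for illustration. It treats each loop nest as a list of addresses, models the compiler-inserted instructions as a step that puts every line the nest does not touch into a state-preserving low-leakage mode, and charges a small wake-up delay when such a line is accessed again.

```python
# Toy leakage model; every constant here is a hypothetical value,
# not a figure taken from the paper.
ACTIVE_LEAK = 1.0   # leakage per line per cycle in active mode (arbitrary units)
DROWSY_LEAK = 0.1   # leakage per line per cycle in low-leakage mode
WAKE_PENALTY = 1    # extra cycles needed to reactivate a line


def line_of(addr, line_size, num_lines):
    """Map an address to a line index in a direct-mapped cache."""
    return (addr // line_size) % num_lines


def simulate(phases, num_lines=8, line_size=32, managed=True):
    """Return (leakage_energy, cycles) for a list of loop-nest address traces.

    With managed=True, lines not touched by the upcoming loop nest are put
    into low-leakage mode before the nest starts. Contents are assumed to be
    preserved, so only the wake-up delay is charged, never an extra miss.
    """
    active = [True] * num_lines
    energy = 0.0
    cycles = 0
    for addrs in phases:
        if managed:
            needed = {line_of(a, line_size, num_lines) for a in addrs}
            for i in range(num_lines):
                if i not in needed:
                    active[i] = False  # modelled "turn off" instruction
        for a in addrs:
            i = line_of(a, line_size, num_lines)
            cost = 1
            if not active[i]:
                active[i] = True       # wake the line on first use
                cost += WAKE_PENALTY
            n_active = sum(active)
            per_cycle = n_active * ACTIVE_LEAK + (num_lines - n_active) * DROWSY_LEAK
            energy += cost * per_cycle
            cycles += cost
    return energy, cycles


if __name__ == "__main__":
    # Two loop nests, each sweeping a different 64-byte array.
    phases = [list(range(0, 64, 4)), list(range(64, 128, 4))]
    print("unmanaged:", simulate(phases, managed=False))
    print("managed:  ", simulate(phases, managed=True))
```

In this model the managed run spends a few extra cycles on wake-ups while keeping most lines in the low-leakage mode, which mirrors the trade-off the abstract describes. The size of the saving depends entirely on the assumed constants and says nothing about the results reported in the paper.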