Coupling compiler-enabled and conventional memory accessing for energy efficiency

Authors:
Raksit Ashok;Saurabh Chheda;Csaba Andras Moritz
Affiliations:
BlueRISC Inc., Hadley, MA;BlueRISC Inc., Hadley, MA;University of Massachusetts, Amherst, MA
Venue:
ACM Transactions on Computer Systems (TOCS)
Year:
2004

Citing 34
Cited 0

An in-cache address translation mechanism

ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Software-controlled caches in the VMP multiprocessor

ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Coherency for multiprocessor virtual address caches

ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
The Wisconsin multicube: a new large-scale cache-coherent multiprocessor

ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Organization and performance of a two-level virtual-real cache hierarchy

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Computer architecture: a quantitative approach

Computer architecture: a quantitative approach
A simulation based study of TLB performance

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Consistency management for virtually indexed caches

ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Evaluation of A+B=K Conditions Without Carry Propagation

IEEE Transactions on Computers
Threshold-voltage control schemes through substrate-bias for low-power high-speed CMOS LSI design

Journal of VLSI Signal Processing Systems - Special issue on technologies for wireless computing
Reducing TLB power requirements

ISLPED '97 Proceedings of the 1997 international symposium on Low power electronics and design
The filter cache: an energy efficient memory structure

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
MediaBench: a tool for evaluating and synthesizing multimedia and communicatons systems

MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor

Digital Technical Journal
Power considerations in the design of the Alpha 21264 microprocessor

DAC '98 Proceedings of the 35th annual Design Automation Conference
Way-predicting set-associative cache for high performance and low energy consumption

ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Selective cache ways: on-demand cache resource allocation

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Wattch: a framework for architectural-level power analysis and optimizations

Proceedings of the 27th annual international symposium on Computer architecture
A recursive algorithm for low-power memory partitioning

ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories

ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Cache Memories

ACM Computing Surveys (CSUR)
Dynamic zero compression for cache energy reduction

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Memory hierarchy reconfiguration for energy and performance in general-purpose processor architectures

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Power aware microarchitecture resource scaling

Proceedings of the conference on Design, automation and test in Europe
Uniprocessor Virtual Memory without TLBs

IEEE Transactions on Computers
L1 data cache decomposition for energy efficiency

ISLPED '01 Proceedings of the 2001 international symposium on Low power electronics and design
Drowsy caches: simple techniques for reducing leakage power

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Design of High-Performance Microprocessor Circuits

Design of High-Performance Microprocessor Circuits
Reducing set-associative cache energy via way-prediction and selective direct-mapping

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Direct addressed caches for reduced power consumption

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Cool-cache for hot multimedia

Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Managing leakage for transient data: decay and quasi-static 4T memory cells

Proceedings of the 2002 international symposium on Low power electronics and design
SPEC CPU2000: Measuring CPU Performance in the New Millennium

Computer
Software-Managed Address Translation

HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture

Quantified Score

Hi-index	0.00

Visualization

Abstract

This article presents Cool-Mem, a family of memory system architectures that integrate conventional memory system mechanisms, energy-aware address translation, and compiler-enabled cache disambiguation techniques, to reduce energy consumption in general-purpose architectures. The solutions provided in this article leverage on interlayer tradeoffs between architecture, compiler, and operating system layers. Cool-Mem achieves power reduction by statically matching memory operations with energy-efficient cache and virtual memory access mechanisms. It combines statically speculative cache access modes, a dynamic content addressable memory-based (CAM-based) Tag-Cache used as backup for statically mispredicted accesses, different conventional multilevel associative cache organizations, embedded protection checking along all cache access mechanisms, as well as architectural organizations to reduce the power consumed by address translation in virtual memory. Because it is based on speculative static information, a superset of the predictable program information available at compile-time, our approach removes the burden of provable correctness in compiler analysis passes that extract static information. This makes Cool-Mem highly practical, applicable for large and complex applications, without having any limitations due to complexity issues in our compiler passes or the presence of precompiled static libraries. Based on extensive evaluation, for both SPEC2000 and Mediabench applications, we obtain from 6% to 19% total energy savings in the processor, with performance ranging from 1.5% degradation to 6% improvement, for the applications studied. We have also compared Cool-Mem to several prior arts and have found Cool-Mem to perform better in almost all cases.