Designing a practical data filter cache to improve both energy efficiency and performance

Authors:
Alen Bardizbanyan; Magnus Själander; David Whalley; Per Larsson-Edefors
Affiliations:
Chalmers University of Technology, Gothenburg, Sweden; Florida State University, FL, USA; Florida State University, FL, USA; Chalmers University of Technology, Gothenburg, Sweden
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2013

Abstract

Conventional Data Filter Cache (DFC) designs improve processor energy efficiency, but degrade performance. Furthermore, the single-cycle line transfer suggested in prior studies adversely affects Level-1 Data Cache (L1 DC) area and energy efficiency. We propose a practical DFC that is accessed early in the pipeline and transfers a line over multiple cycles. Our DFC design improves performance and eliminates a substantial fraction of L1 DC accesses for loads, L1 DC tag checks on stores, and data translation lookaside buffer accesses for both loads and stores. Our evaluation shows that the proposed DFC can reduce the data access energy by 42.5% and improve execution time by 4.2%.
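The filtering idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' design: it only counts how many loads a small direct-mapped filter cache would serve before they reach the L1 DC. It does not model early pipeline access, multi-cycle line transfers, stores, DTLB accesses, or energy. The class name, function name, the 4-line capacity, and the 32-byte line size are all assumptions chosen for illustration.

```python
class ToyFilterCache:
    """Toy direct-mapped filter cache in front of an L1 data cache (hypothetical model)."""

    def __init__(self, num_lines=4, line_bytes=32):
        # num_lines and line_bytes are illustrative assumptions, not values from the paper.
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines  # line address held by each entry
        self.dfc_hits = 0               # loads served by the filter cache
        self.l1_accesses = 0            # loads that had to go to the L1 DC

    def load(self, addr):
        """Return True if the load hits in the filter cache, False otherwise."""
        line_addr = addr // self.line_bytes
        index = line_addr % self.num_lines
        if self.tags[index] == line_addr:
            self.dfc_hits += 1
            return True
        # Miss: the L1 DC is accessed and the line is installed in the filter cache.
        self.l1_accesses += 1
        self.tags[index] = line_addr
        return False


def fraction_filtered(addresses, **kwargs):
    """Fraction of loads that never reach the L1 DC in this toy model."""
    cache = ToyFilterCache(**kwargs)
    for addr in addresses:
        cache.load(addr)
    total = cache.dfc_hits + cache.l1_accesses
    return cache.dfc_hits / total if total else 0.0
```

For a sequential walk over 4-byte words, `fraction_filtered(range(0, 128, 4))` gives 0.875 in this model, since each 32-byte line misses once and then hits seven times. Two addresses that map to the same entry and alternate, such as 0 and 128, are never filtered.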