Reducing energy of virtual cache synonym lookup using bloom filters

Authors:
Dong Hyuk Woo;Mrinmoy Ghosh;Emre Özer;Stuart Biles;Hsien-Hsin S. Lee
Affiliations:
Georgia Institute of Technology, Atlanta, GA;Georgia Institute of Technology, Atlanta, GA;ARM Ltd., Cambridge, UK;ARM Ltd., Cambridge, UK;Georgia Institute of Technology, Atlanta, GA
Venue:
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Year:
2006

Abstract

Virtual caches are employed as L1 caches in both high-performance and embedded processors to meet their short latency requirements. However, they also introduce the synonym problem, in which the same physical cache line can be present at multiple locations in the cache because it is mapped by distinct virtual addresses, leading to potential data consistency issues. To guarantee correctness, common hardware solutions either perform serial lookups of all possible synonym locations in the L1, consuming additional energy, or employ a reverse map in the L2 cache, which incurs a large area overhead. Such preventive mechanisms are nevertheless indispensable, even though synonyms may not always be present during execution.

In this paper, we study the synonym issue using a workload of Windows applications and propose a technique based on Bloom filters to reduce synonym lookup energy. By tracking the address stream using Bloom filters, we can confidently exclude addresses that were never observed and thereby eliminate unnecessary synonym lookups, saving energy in the L1 cache. Bloom filters have a very small area overhead, making our implementation a feasible and attractive solution for synonym detection. Our results show that synonyms in these applications actually constitute less than 0.1% of the total cache misses. By applying our technique, the dynamic energy consumed in the L1 data cache can be reduced by up to 32.5%. When leakage energy is taken into account, the savings are up to 27.6%.
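
The filtering idea described in the abstract can be illustrated with a small software model. The sketch below is not the paper's hardware design, whose filter organization, sizes, and hash functions are not given here. It assumes a generic counting Bloom filter keyed by physical line address, updated on line fills and evictions, and queried on an L1 miss. The names `CountingBloomFilter` and `needs_synonym_lookup`, the 1024-counter size, and the SHA-256 based hashing are all hypothetical choices made for illustration.

```python
import hashlib


class CountingBloomFilter:
    """Generic counting Bloom filter (illustrative, not the paper's design).

    A query never returns False for a key that is currently tracked
    (no false negatives), but it may return True for an untracked key
    (false positives).
    """

    def __init__(self, num_counters=1024, num_hashes=2):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _indexes(self, key):
        # Derive one counter index per hash from a salted SHA-256 digest.
        # Real hardware would use much cheaper hash functions.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % len(self.counters)

    def add(self, key):
        # Called when a line is filled into the cache.
        for i in self._indexes(key):
            self.counters[i] += 1

    def remove(self, key):
        # Called when a line is evicted; counters make deletion possible.
        for i in self._indexes(key):
            if self.counters[i] > 0:
                self.counters[i] -= 1

    def may_contain(self, key):
        return all(self.counters[i] > 0 for i in self._indexes(key))


def needs_synonym_lookup(bloom, physical_line):
    """On an L1 miss, decide whether synonym locations must be searched.

    False means the physical line is definitely not cached under any
    virtual alias, so the energy-consuming synonym lookups can be skipped.
    True means a synonym may exist and the normal lookups must proceed.
    """
    return bloom.may_contain(physical_line)


if __name__ == "__main__":
    bloom = CountingBloomFilter()
    line = 0x1A2B40                             # hypothetical physical line address
    print(needs_synonym_lookup(bloom, line))    # False: nothing tracked yet
    bloom.add(line)                             # line fill
    print(needs_synonym_lookup(bloom, line))    # True: a synonym may be cached
    bloom.remove(line)                          # eviction
    print(needs_synonym_lookup(bloom, line))    # False: lookup can be skipped
```

Because a Bloom filter has no false negatives, skipping the lookup on a negative answer is always safe. A false positive only costs an unnecessary lookup, which is the energy the scheme was already paying without the filter.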