Just Say No: Benefits of Early Cache Miss Determination

Authors:
Gokhan Memik;Glenn Reinman;William H. Mangione-Smith
Venue:
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Year:
2003

Abstract

As the performance gap between processor cores and the memory subsystem increases, designers are forced to develop new latency-hiding techniques. Arguably, the most common technique is to utilize multi-level caches. Each new generation of processors is equipped with more levels of memory hierarchy, with increasing sizes at each level. In this paper, we propose 5 different techniques that reduce the data access times and power consumption in processors with multi-level caches. Using information about the blocks placed into and replaced from the caches, the techniques quickly determine whether an access at any cache level will be a miss. The accesses that are identified as misses are aborted. The structures used to recognize misses are much smaller than the cache structures. Consequently, the data access times and power consumption are reduced. Using the SimpleScalar simulator, we study the performance of these techniques for a processor with 5 cache levels. The best technique is able to abort 53.1% of the misses on average in SPEC2000 applications. Using these techniques, the execution time of the applications is reduced by up to 12.4% (5.4% on average), and the power consumption of the caches is reduced by as much as 11.6% (3.8% on average).
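The abstract does not describe the five techniques themselves, so the following is only a minimal sketch of the general idea, not the authors' design. It assumes a small table of counters indexed by a hash of the block address: a counter is incremented when a block is placed into a cache level and decremented when a block is replaced. A zero counter then guarantees a miss, while a nonzero counter says nothing either way. The class name `MissFilter`, its parameters, and the modulo hash are all hypothetical choices made for illustration.

```python
class MissFilter:
    """Hypothetical miss detector for one cache level (illustrative only)."""

    def __init__(self, num_counters=1024, block_bits=6):
        # One small counter per table entry; far fewer entries than cache blocks.
        self.counters = [0] * num_counters
        # Assumed block size of 2**block_bits bytes (64 bytes by default).
        self.block_bits = block_bits

    def _index(self, address):
        # Drop the block offset, then fold the block number into the table.
        block = address >> self.block_bits
        return block % len(self.counters)

    def on_place(self, address):
        # Called when a block is placed into the cache.
        self.counters[self._index(address)] += 1

    def on_replace(self, address):
        # Called when a block is replaced (evicted) from the cache.
        i = self._index(address)
        if self.counters[i] == 0:
            raise ValueError("replace without a matching placement")
        self.counters[i] -= 1

    def definitely_misses(self, address):
        # True  -> no resident block maps here, so the access can be aborted.
        # False -> the block may or may not be present; do the normal lookup.
        return self.counters[self._index(address)] == 0
```

Because several blocks can share one counter, a structure like this can only fail in the safe direction: it may miss an opportunity to abort an access, but it never reports a miss for a block that is actually resident, provided every placement and replacement is recorded.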