Tradeoffs in two-level on-chip caching

Authors:
N. P. Jouppi; S. J. E. Wilton
Affiliations:
Digital Equipment Corporation Western Research Lab, 250 University Avenue, Palo Alto, CA; Dept. of Electrical and Computer Engineering, University of Toronto, 10 King's College Rd., Toronto, Ontario, Canada M5S 1A4
Venue:
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Year:
1994

Abstract

The performance of two-level on-chip caching is investigated for a range of technology and architecture assumptions. The area and access time of each level of cache are modeled in detail. The results indicate that for most workloads, two-level cache configurations (with a set-associative second level) perform marginally better than single-level cache configurations that require the same chip area once the first-level cache sizes are 64KB or larger. Two-level configurations become even more important in systems with no off-chip cache and in systems in which the memory cells in the first-level caches are multiported and hence larger than those in the second-level cache. Finally, a new replacement policy called two-level exclusive caching is introduced. Two-level exclusive caching improves the performance of two-level caching organizations by increasing the effective associativity and capacity.
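
The effect of exclusive caching on effective associativity can be illustrated with a toy model. The sketch below is not the simulator used in the paper. It assumes two direct-mapped caches indexed by line address, ignores writes and timing, and models exclusion as a swap: on a first-level miss that hits in the second level, the requested line moves up and the first-level victim moves down. On a miss in both levels, the new line is loaded into the first level only and the victim is placed in the second level. The function name `simulate` and its parameters are hypothetical.

```python
def simulate(trace, l1_lines, l2_lines, exclusive):
    """Count off-chip fetches for a trace of line addresses.

    Toy model: both levels are direct-mapped and hold line addresses only.
    """
    l1 = [None] * l1_lines  # one resident line address per L1 set
    l2 = [None] * l2_lines  # one resident line address per L2 set
    off_chip = 0
    for addr in trace:
        i1, i2 = addr % l1_lines, addr % l2_lines
        if l1[i1] == addr:
            continue  # first-level hit
        victim = l1[i1]  # line displaced from the first level, if any
        if l2[i2] == addr:
            # Second-level hit: the line moves (or is copied) into L1.
            l1[i1] = addr
            if exclusive:
                l2[i2] = None  # the line now lives only in L1
                if victim is not None:
                    l2[victim % l2_lines] = victim  # victim moves down
        else:
            # Miss in both levels: fetch from off-chip memory.
            off_chip += 1
            l1[i1] = addr
            if exclusive:
                if victim is not None:
                    l2[victim % l2_lines] = victim  # keep the victim on chip
            else:
                l2[i2] = addr  # conventional fill: copy into L2 as well
    return off_chip


# Lines 0 and 16 conflict in both levels (4-line L1, 16-line L2).
trace = [0, 16] * 4
print(simulate(trace, 4, 16, exclusive=False))  # 8: every reference goes off-chip
print(simulate(trace, 4, 16, exclusive=True))   # 2: only the two cold misses
```

In this model the two conflicting lines ping-pong off-chip under the conventional fill policy, while under the exclusive policy they simply trade places between the two levels, so the pair behaves as if the hierarchy had extra associativity.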