PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches

Authors:
Yuejian Xie; Gabriel H. Loh
Affiliations:
Georgia Institute of Technology, Atlanta, GA, USA; Georgia Institute of Technology, Atlanta, GA, USA
Venue:
Proceedings of the 36th annual international symposium on Computer architecture
Year:
2009

Abstract

Many multi-core processors employ a large last-level cache (LLC) shared among the cores. Past research has demonstrated that sharing-oblivious cache management policies (e.g., LRU) can lead to poor performance and fairness when multiple cores compete for the limited LLC capacity. Different memory access patterns can cause cache contention in different ways, and various techniques have been proposed to target some of these behaviors. In this work, we propose a new cache management approach that combines dynamic insertion and promotion policies to provide the benefits of cache partitioning, adaptive insertion, and capacity stealing, all with a single mechanism. By handling multiple types of memory behaviors, our proposed technique outperforms techniques that target only capacity partitioning or only adaptive insertion.
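The abstract does not spell out the mechanism, so the following is only a minimal sketch of how insertion and promotion policies could jointly approximate a partition within one cache set. Every name here (`PseudoPartitionedSet`, `alloc`, `p_promote`) is hypothetical, and the specific rules (insert at a position derived from the core's target allocation, promote by one position on a hit, always evict the lowest-priority line) are assumptions made for illustration, not details taken from the text above.

```python
import random


class PseudoPartitionedSet:
    """One cache set. Index 0 is the next victim; the last index has the highest priority."""

    def __init__(self, ways, alloc, p_promote=1.0, rng=None):
        self.ways = ways
        self.alloc = alloc              # assumed input: core id -> target number of ways
        self.p_promote = p_promote      # assumed probability of promoting on a hit
        self.rng = rng or random.Random(0)
        self.lines = []                 # (core, tag) pairs, lowest priority first

    def access(self, core, tag):
        """Return True on a hit and False on a miss."""
        entry = (core, tag)             # simplification: lines are private to a core
        if entry in self.lines:
            i = self.lines.index(entry)
            # Promotion policy: move the line up by a single position.
            if i < len(self.lines) - 1 and self.rng.random() < self.p_promote:
                self.lines[i], self.lines[i + 1] = self.lines[i + 1], self.lines[i]
            return True
        if len(self.lines) == self.ways:
            self.lines.pop(0)           # evict the lowest-priority line
        # Insertion policy: a larger allocation inserts further from eviction.
        pos = min(self.alloc[core] - 1, len(self.lines))
        self.lines.insert(pos, entry)
        return False
```

In this sketch a core with an allocation of 1 inserts at the eviction end, so a streaming core mostly displaces its own lines, while a core with a larger allocation inserts higher and keeps its lines longer. Nothing enforces a hard boundary, which is why free capacity can still be used by any core.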