Towards practical page coloring-based multicore cache management

Authors:
Xiao Zhang;Sandhya Dwarkadas;Kai Shen
Affiliations:
University of Rochester, Rochester, NY, USA;University of Rochester, Rochester, NY, USA;University of Rochester, Rochester, NY, USA
Venue:
Proceedings of the 4th ACM European conference on Computer systems
Year:
2009

Abstract

Modern multi-core processors present new resource management challenges due to the subtle interactions of simultaneously executing processes sharing on-chip resources (particularly the L2 cache). Recent research demonstrates that the operating system may use the page coloring mechanism to control cache partitioning, and consequently to achieve fair and efficient cache utilization. However, page coloring places additional constraints on memory space allocation, which may conflict with application memory needs. Further, adaptive adjustments of cache partitioning policies in a multi-programmed execution environment may incur substantial overhead for page recoloring (or copying). This paper proposes a hot-page coloring approach enforcing coloring on only a small set of frequently accessed (or hot) pages for each process. The cost of identifying hot pages online is reduced by leveraging the knowledge of spatial locality during a page table scan of access bits. Our results demonstrate that hot page identification and selective coloring can significantly alleviate the coloring-induced adverse effects in practice. However, we also reach the somewhat negative conclusion that without additional hardware support, adaptive page coloring is only beneficial when recoloring is performed infrequently (meaning long scheduling time quanta in multi-programmed executions).
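
The page coloring mechanism mentioned in the abstract can be illustrated with a minimal sketch. It assumes a physically indexed, set-associative cache whose set index is taken directly from the physical address (no index hashing). The function names, the 4 MiB / 16-way cache geometry, and the proportional split are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of page coloring arithmetic (illustrative assumptions only).
# Assumes a physically indexed cache with a plain modulo set index.

def num_colors(cache_bytes, associativity, page_bytes=4096):
    """Number of page colors: bytes covered by one cache way, divided by page size."""
    way_bytes = cache_bytes // associativity
    return way_bytes // page_bytes

def page_color(phys_addr, n_colors, page_bytes=4096):
    """Color of the physical page holding phys_addr.

    Pages with the same color map to the same group of cache sets.
    """
    return (phys_addr // page_bytes) % n_colors

def split_colors(n_colors, shares):
    """Give each process a contiguous range of colors proportional to its share.

    Hypothetical policy for illustration; colors left over after integer
    rounding stay unassigned.
    """
    total = sum(shares.values())
    ranges, start = {}, 0
    for name, share in shares.items():
        count = n_colors * share // total
        ranges[name] = range(start, start + count)
        start += count
    return ranges

if __name__ == "__main__":
    n = num_colors(4 * 2**20, 16)          # hypothetical 4 MiB, 16-way shared cache
    print(n)                               # 64
    print(page_color(0x12345000, n))       # 5
    print(split_colors(n, {"A": 3, "B": 1}))
```

Under these assumptions, an operating system that hands process A only physical pages whose color falls in A's range confines A to the matching fraction of the cache. Changing the split means moving pages to frames of a different color, which is the recoloring (copying) cost the abstract refers to.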