Topology-aware quality-of-service support in highly integrated chip multiprocessors

Authors:
Boris Grot;Stephen W. Keckler;Onur Mutlu
Affiliations:
The University of Texas at Austin, Austin, TX;The University of Texas at Austin, Austin, TX;Carnegie Mellon University, Pittsburgh, PA
Venue:
ISCA'10 Proceedings of the 2010 international conference on Computer Architecture
Year:
2010

Abstract

Power limitations and complexity constraints demand modular designs, such as chip multiprocessors (CMPs) and systems-on-chip (SOCs). Today's CMPs feature up to a hundred discrete cores, with greater levels of integration anticipated in the future. Supporting effective on-chip resource sharing for cloud computing and server consolidation necessitates CMP-level quality-of-service (QOS) for performance isolation, service guarantees, and security. This work takes a topology-aware approach to on-chip QOS. We propose to segregate shared resources into dedicated, QOS-enabled regions of the chip. We then eliminate QOS-related hardware and its associated overheads from the rest of the die via a combination of topology and operating system support. We evaluate several topologies for the QOS-enabled regions, including a new organization called Destination Partitioned Subnets (DPS), which uses a light-weight dedicated network for each destination node. DPS matches or bests other topologies with comparable bisection bandwidth in performance, area and energy efficiency, fairness, and preemption resilience.