QoS-Aware Scheduling in Heterogeneous Datacenters with Paragon

Authors:
Christina Delimitrou;Christos Kozyrakis
Affiliations:
Stanford University;Stanford University
Venue:
ACM Transactions on Computer Systems (TOCS)
Year:
2013

Abstract

Large-scale datacenters (DCs) host tens of thousands of diverse applications each day. However, interference between colocated workloads and the difficulty of matching applications to one of the many hardware platforms available can degrade performance, violating the quality of service (QoS) guarantees that many cloud workloads require. While previous work has identified the impact of heterogeneity and interference, existing solutions are computationally intensive, cannot be applied online, and do not scale beyond a few applications. We present Paragon, an online and scalable DC scheduler that is heterogeneity- and interference-aware. Paragon is derived from robust analytical methods, and instead of profiling each application in detail, it leverages information the system already has about applications it has previously seen. It uses collaborative filtering techniques to quickly and accurately classify an unknown incoming workload with respect to heterogeneity and interference in multiple shared resources. It does so by identifying similarities to previously scheduled applications. The classification allows Paragon to greedily schedule applications in a manner that minimizes interference and maximizes server utilization. After the initial application placement, Paragon monitors application behavior and adjusts the scheduling decisions at runtime to avoid performance degradations. Additionally, we design ARQ, a multiclass admission control protocol that constrains application waiting time. ARQ queues applications in separate classes based on the type of resources they need and avoids long queueing delays for easy-to-satisfy workloads in highly loaded scenarios. Paragon scales to tens of thousands of servers and applications with marginal scheduling overheads in terms of time or state. We evaluate Paragon with a wide range of workload scenarios, on both small and large-scale systems, including 1,000 servers on EC2.
For a 2,500-workload scenario, Paragon enforces performance guarantees for 91% of applications, while significantly improving utilization. In comparison, heterogeneity-oblivious, interference-oblivious, and least-loaded schedulers provide similar guarantees for only 14%, 11%, and 3% of workloads, respectively. The differences are more striking in oversubscribed scenarios, where resource efficiency is more critical.
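
The classification and greedy placement steps described in the abstract can be illustrated with a small sketch. This is not Paragon's implementation: it assumes a toy applications-by-platforms score matrix, uses an iterative truncated-SVD fill as a generic stand-in for the collaborative filtering step, and reduces placement to "best predicted platform that still has a free server". All names (`complete_scores`, `greedy_place`, the sample numbers) are hypothetical and chosen only for illustration.

```python
import numpy as np

def complete_scores(ratings, rank=1, iters=100):
    """Fill NaN entries of an apps x platforms matrix with a low-rank estimate.

    Generic iterative truncated-SVD imputation, used here as a stand-in
    for collaborative filtering (an assumption, not the paper's exact method).
    """
    known = ~np.isnan(ratings)
    # Start from per-platform means over the applications already profiled.
    filled = np.where(known, ratings, np.nanmean(ratings, axis=0))
    for _ in range(iters):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
        # Keep measured values; replace only the unknown entries.
        filled = np.where(known, ratings, approx)
    return filled

def greedy_place(scores, free_servers):
    """Return the highest-scoring platform that still has a free server."""
    for platform in np.argsort(-scores):
        if free_servers.get(int(platform), 0) > 0:
            return int(platform)
    return None  # nothing free: the app would have to wait in a queue

# Hypothetical history: 4 previously seen apps scored on 4 platforms.
platform_speed = np.array([1.0, 2.0, 0.5, 1.5])
history = np.outer([1.0, 2.0, 3.0, 4.0], platform_speed)

# A new app is measured on platforms 0 and 2 only; the rest is unknown.
new_app = np.array([3.5, np.nan, 1.75, np.nan])

completed = complete_scores(np.vstack([history, new_app]))
predicted = completed[-1]

# Platform 1 is predicted to be best but is fully occupied in this example.
free = {0: 2, 1: 0, 2: 1, 3: 1}
choice = greedy_place(predicted, free)
print(predicted, choice)
```

In this toy setup the new application is never run on platforms 1 and 3; their scores are inferred from how earlier applications behaved, which is the idea the abstract attributes to Paragon. A real scheduler would also classify interference in shared resources and revisit placements at runtime, both of which this sketch leaves out.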