The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Resource containers: a new facility for resource management in server systems
OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
On deciding stability of scheduling policies in queueing systems
SODA '00 Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms
Managing energy and server resources in hosting centers
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Methods and metrics for cold-start recommendations
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Stability, capacity, and scheduling of multiclass queueing networks
Stability, capacity, and scheduling of multiclass queueing networks
Introduction to Modern Cryptography (Chapman & Hall/Crc Cryptography and Network Security Series)
Introduction to Modern Cryptography (Chapman & Hall/Crc Cryptography and Network Security Series)
Exploiting Platform Heterogeneity for Power Efficient Data Centers
ICAC '07 Proceedings of the Fourth International Conference on Autonomic Computing
Improving Performance Isolation on Chip Multiprocessors via an Operating System Scheduler
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
Modern Operating Systems
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Exploiting Item Taxonomy for Solving Cold-Start Problem in Recommendation Making
ICTAI '08 Proceedings of the 2008 20th IEEE International Conference on Tools with Artificial Intelligence - Volume 02
Workload Analysis and Demand Prediction of Enterprise Data Center Applications
IISWC '07 Proceedings of the 2007 IEEE 10th International Symposium on Workload Characterization
HASS: a scheduler for heterogeneous multicore systems
ACM SIGOPS Operating Systems Review
Internet-scale service infrastructure efficiency
Proceedings of the 36th annual international symposium on Computer architecture
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
On the energy (in)efficiency of Hadoop clusters
ACM SIGOPS Operating Systems Review
Q-clouds: managing performance interference effects for QoS-aware clouds
Proceedings of the 5th European conference on Computer systems
Mesos: a platform for fine-grained resource sharing in the data center
Proceedings of the 8th USENIX conference on Networked systems design and implementation
Dominant resource fairness: fair allocation of multiple resource types
Proceedings of the 8th USENIX conference on Networked systems design and implementation
Data Mining: Practical Machine Learning Tools and Techniques
Data Mining: Practical Machine Learning Tools and Techniques
Vantage: scalable and efficient fine-grain cache partitioning
Proceedings of the 38th annual international symposium on Computer architecture
Power management of online data-intensive services
Proceedings of the 38th annual international symposium on Computer architecture
Warehouse-Scale Computing: Entering the Teenage Decade
Proceedings of the 38th annual international symposium on Computer architecture
CloudScale: elastic resource scaling for multi-tenant cloud systems
Proceedings of the 2nd ACM Symposium on Cloud Computing
Proceedings of the 2nd ACM Symposium on Cloud Computing
Windows Azure Storage: a highly available cloud storage service with strong consistency
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Heterogeneity in “Homogeneous” Warehouse-Scale Computers: A Performance Opportunity
IEEE Computer Architecture Letters
DejaVu: accelerating resource allocation in virtualized environments
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Large-scale machine learning at twitter
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Scheduling heterogeneous multi-cores through Performance Impact Estimation (PIE)
Proceedings of the 39th Annual International Symposium on Computer Architecture
Paragon: QoS-aware scheduling for heterogeneous datacenters
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Omega: flexible, scalable schedulers for large compute clusters
Proceedings of the 8th ACM European Conference on Computer Systems
CPI2: CPU performance isolation for shared compute clusters
Proceedings of the 8th ACM European Conference on Computer Systems
Bubble-flux: precise online QoS management for increased utilization in warehouse scale computers
Proceedings of the 40th Annual International Symposium on Computer Architecture
Whare-map: heterogeneity in "homogeneous" warehouse-scale computers
Proceedings of the 40th Annual International Symposium on Computer Architecture
The Netflix Challenge: Datacenter Edition
IEEE Computer Architecture Letters
USENIX ATC'13 Proceedings of the 2013 USENIX conference on Annual Technical Conference
Hi-index | 0.00 |
Large-scale datacenters (DCs) host tens of thousands of diverse applications each day. However, interference between colocated workloads and the difficulty of matching applications to one of the many hardware platforms available can degrade performance, violating the quality of service (QoS) guarantees that many cloud workloads require. While previous work has identified the impact of heterogeneity and interference, existing solutions are computationally intensive, cannot be applied online, and do not scale beyond a few applications. We present Paragon, an online and scalable DC scheduler that is heterogeneity- and interference-aware. Paragon is derived from robust analytical methods, and instead of profiling each application in detail, it leverages information the system already has about applications it has previously seen. It uses collaborative filtering techniques to quickly and accurately classify an unknown incoming workload with respect to heterogeneity and interference in multiple shared resources. It does so by identifying similarities to previously scheduled applications. The classification allows Paragon to greedily schedule applications in a manner that minimizes interference and maximizes server utilization. After the initial application placement, Paragon monitors application behavior and adjusts the scheduling decisions at runtime to avoid performance degradations. Additionally, we design ARQ, a multiclass admission control protocol that constrains application waiting time. ARQ queues applications in separate classes based on the type of resources they need and avoids long queueing delays for easy-to-satisfy workloads in highly-loaded scenarios. Paragon scales to tens of thousands of servers and applications with marginal scheduling overheads in terms of time or state. We evaluate Paragon with a wide range of workload scenarios, on both small and large-scale systems, including 1,000 servers on EC2. For a 2,500-workload scenario, Paragon enforces performance guarantees for 91% of applications, while significantly improving utilization. In comparison, heterogeneity-oblivious, interference-oblivious, and least-loaded schedulers only provide similar guarantees for 14%, 11%, and 3% of workloads. The differences are more striking in oversubscribed scenarios where resource efficiency is more critical.