Utilization-Based Admission Control for Real-Time Applications
ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
ISADS '01 Proceedings of the Fifth International Symposium on Autonomous Decentralized Systems
SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Value-maximizing deadline scheduling and its application to animation rendering
Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures
Dynamic Provisioning of Multi-tier Internet Applications
ICAC '05 Proceedings of the Second International Conference on Automatic Computing
A statistical approach to risk mitigation in computational markets
Proceedings of the 16th international symposium on High performance distributed computing
Dryad: distributed data-parallel programs from sequential building blocks
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
Pig Latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Online Optimization for Latency Assignment in Distributed Real-Time Systems
ICDCS '08 Proceedings of the 2008 The 28th International Conference on Distributed Computing Systems
SCOPE: easy and efficient parallel processing of massive data sets
Proceedings of the VLDB Endowment
Validity of the single processor approach to achieving large scale computing capabilities
AFIPS '67 (Spring) Proceedings of the April 18-20, 1967, spring joint computer conference
MapReduce optimization using regulated dynamic prioritization
Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems
Quincy: fair scheduling for distributed computing clusters
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Hive: a warehousing solution over a map-reduce framework
Proceedings of the VLDB Endowment
Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling
Proceedings of the 5th European conference on Computer systems
Prediction-based enforcement of performance contracts
GECON'07 Proceedings of the 4th international conference on Grid economics and business models
FlumeJava: easy, efficient data-parallel pipelines
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
ParaTimer: a progress indicator for MapReduce DAGs
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
The impact of virtualization on network performance of Amazon EC2 data center
INFOCOM'10 Proceedings of the 29th conference on Information communications
CloudCmp: comparing public cloud providers
IMC '10 Proceedings of the 10th ACM SIGCOMM conference on Internet measurement
Reining in the outliers in map-reduce clusters using Mantri
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Scarlett: coping with skewed content popularity in MapReduce clusters
Proceedings of the sixth conference on Computer systems
CIEL: a universal execution engine for distributed data-flow computing
Proceedings of the 8th USENIX conference on Networked systems design and implementation
Apache Hadoop goes realtime at Facebook
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
No one (cluster) size fits all: automatic cluster sizing for data-intensive analytics
Proceedings of the 2nd ACM Symposium on Cloud Computing
Optimizing data shuffling in data-parallel computation by understanding user-defined functions
NSDI'12 Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation
Automated diagnosis without predictability is a recipe for failure
HotCloud'12 Proceedings of the 4th USENIX conference on Hot Topics in Cloud Computing
Bridging the tenant-provider gap in cloud services
Proceedings of the Third ACM Symposium on Cloud Computing
Cake: enabling high-level SLOs on shared storage systems
Proceedings of the Third ACM Symposium on Cloud Computing
alsched: algebraic scheduling of mixed workloads in heterogeneous clouds
Proceedings of the Third ACM Symposium on Cloud Computing
CloudPack: exploiting workload flexibility through rational pricing
Proceedings of the 13th International Middleware Conference
Building and scaling virtual clusters with residual resources from interactive clouds
Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
Omega: flexible, scalable schedulers for large compute clusters
Proceedings of the 8th ACM European Conference on Computer Systems
Speeding up distributed request-response workflows
Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM
Efficient online scheduling for deadline-sensitive jobs: extended abstract
Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures
Proceedings of the 4th annual Symposium on Cloud Computing
Agile middleware for scheduling: meeting competing performance requirements of diverse tasks
Proceedings of the 5th ACM/SPEC international conference on Performance engineering
Data processing frameworks such as MapReduce [8] and Dryad [11] are used today in business environments where customers expect guaranteed performance. To date, however, these systems cannot guarantee job latency, because scheduling policies are based on fair sharing and operators seek high cluster use through statistical multiplexing and over-subscription. With Jockey, we provide latency SLOs for data-parallel jobs written in SCOPE. Jockey precomputes statistics using a simulator that captures the job's complex internal dependencies, accurately and efficiently predicting the remaining run time at different resource allocations and in different stages of the job. Our control policy monitors a job's performance and dynamically adjusts its resource allocation in the shared cluster to maximize the job's economic utility while minimizing its impact on the rest of the cluster. In our experiments in Microsoft's production Cosmos clusters, Jockey meets the specified job latency SLOs and responds to changes in cluster conditions.
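The control idea in the abstract can be illustrated with a small sketch. It assumes a precomputed table of predicted remaining run time, indexed by job progress and by resource allocation, and on each control step picks the smallest allocation that is still predicted to meet the deadline. The table values, the `slack` padding factor, and the function names are hypothetical and chosen only for illustration; the sketch also reduces "maximize economic utility" to "meet the deadline with as few resources as possible", which is a simplification rather than Jockey's actual policy.

```python
from bisect import bisect_right

# Hypothetical precomputed predictions (seconds of remaining run time),
# keyed by progress bucket, then by resource allocation.
# All numbers are illustrative and do not come from the paper.
REMAINING = {
    0.0: {10: 3600, 20: 2000, 40: 1200},
    0.5: {10: 1800, 20: 1000, 40: 600},
    0.9: {10: 360, 20: 200, 40: 120},
}


def predicted_remaining(progress, allocation):
    """Return the prediction for the largest progress bucket <= progress."""
    buckets = sorted(REMAINING)
    idx = max(bisect_right(buckets, progress) - 1, 0)
    return REMAINING[buckets[idx]][allocation]


def choose_allocation(progress, elapsed, deadline, slack=1.2):
    """Pick the smallest allocation predicted to finish by the deadline.

    `slack` pads the prediction to absorb estimation error (an assumed
    safety margin). If no allocation is predicted to meet the deadline,
    fall back to the largest one available.
    """
    allocations = sorted(next(iter(REMAINING.values())))
    for allocation in allocations:
        remaining = predicted_remaining(progress, allocation)
        if elapsed + slack * remaining <= deadline:
            return allocation
    return allocations[-1]


if __name__ == "__main__":
    # One control step per observation: (progress, elapsed seconds).
    for progress, elapsed in [(0.0, 0), (0.5, 1500), (0.5, 2500), (0.95, 1000)]:
        print(progress, elapsed, choose_allocation(progress, elapsed, deadline=3000))
```

Calling `choose_allocation` periodically with fresh progress and elapsed-time readings gives a simple feedback loop: a job that falls behind is granted more resources, and a job running ahead of schedule releases them to the rest of the cluster.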