The case for tiny tasks in compute clusters

Authors:
Kay Ousterhout;Aurojit Panda;Joshua Rosen;Shivaram Venkataraman;Reynold Xin;Sylvia Ratnasamy;Scott Shenker;Ion Stoica
Affiliations:
UC Berkeley;UC Berkeley;UC Berkeley;UC Berkeley;UC Berkeley;UC Berkeley;UC Berkeley and ICSI;UC Berkeley
Venue:
HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
Year:
2013

Abstract

We argue for breaking data-parallel jobs in compute clusters into tiny tasks that each complete in hundreds of milliseconds. Tiny tasks avoid the need for complex skew mitigation techniques: when a large job is broken into millions of tiny tasks, the scheduler can spread the work evenly over available resources. Furthermore, tiny tasks alleviate the long wait times that interactive jobs see in today's clusters: even large batch jobs can be split into small tasks that finish quickly. We demonstrate a 5.2× improvement in response times from the use of smaller tasks. In current data-parallel computing frameworks, high task launch overheads and scalability limitations prevent users from running short tasks. Recent research has addressed many of these bottlenecks; we discuss the remaining challenges and propose a task execution framework that can efficiently support tiny tasks.
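
The skew argument in the abstract can be illustrated with a small simulation. The sketch below is not the paper's experiment or framework: the job (one 100 s task plus seven 10 s tasks), the 4 workers, the 0.5 s task size, and the helper names `makespan` and `split` are all assumptions chosen for illustration. It assumes a greedy scheduler that hands each task to whichever worker frees up first, and it ignores task launch overhead, which the abstract identifies as the practical obstacle.

```python
import heapq


def makespan(task_durations, num_workers):
    """Greedy scheduling: each task goes to the worker that frees up first.

    Returns the time at which the last worker finishes.
    """
    free_at = [0.0] * num_workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    for duration in task_durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + duration)
    return max(free_at)


def split(task_durations, chunk):
    """Split every task into pieces of at most `chunk` seconds."""
    pieces = []
    for duration in task_durations:
        full, rest = divmod(duration, chunk)
        pieces.extend([chunk] * int(full))
        if rest > 0:
            pieces.append(rest)
    return pieces


# Hypothetical skewed job: one 100 s straggler and seven 10 s tasks.
coarse = [100.0] + [10.0] * 7
# The same 170 s of work, cut into 0.5 s ("hundreds of milliseconds") tasks.
tiny = split(coarse, 0.5)

coarse_makespan = makespan(coarse, 4)
tiny_makespan = makespan(tiny, 4)

print(f"coarse tasks: {len(coarse)} tasks, makespan {coarse_makespan} s")
print(f"tiny tasks:   {len(tiny)} tasks, makespan {tiny_makespan} s")
```

With coarse tasks, the job cannot finish before the 100 s straggler does, however idle the other workers are. With tiny tasks, the same work divides evenly and the job finishes in 42.5 s, which is the total work divided by the number of workers. A real system would add a per-task launch cost to every piece, so the benefit depends on keeping that cost small.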