Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling

  • Authors:
  • Matei Zaharia; Dhruba Borthakur; Joydeep Sen Sarma; Khaled Elmeleegy; Scott Shenker; Ion Stoica

  • Affiliations:
  • University of California, Berkeley, Berkeley, CA, USA; Facebook Inc, Palo Alto, CA, USA; Facebook Inc, Palo Alto, CA, USA; Yahoo! Research, Sunnyvale, CA, USA; University of California, Berkeley, Berkeley, CA, USA; University of California, Berkeley, Berkeley, CA, USA

  • Venue:
  • Proceedings of the 5th European conference on Computer systems
  • Year:
  • 2010

Abstract

As organizations start to use data-intensive cluster computing systems like Hadoop and Dryad for more applications, there is a growing need to share clusters between users. However, there is a conflict between fairness in scheduling and data locality (placing tasks on nodes that contain their input data). We illustrate this problem through our experience designing a fair scheduler for a 600-node Hadoop cluster at Facebook. To address the conflict between locality and fairness, we propose a simple algorithm called delay scheduling: when the job that should be scheduled next according to fairness cannot launch a local task, it waits for a small amount of time, letting other jobs launch tasks instead. We find that delay scheduling achieves nearly optimal data locality in a variety of workloads and can increase throughput by up to 2x while preserving fairness. In addition, the simplicity of delay scheduling makes it applicable under a wide variety of scheduling policies beyond fair sharing.
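
The core idea in the abstract (a job that is next in line under fairness but has no local task steps aside briefly so other jobs can launch) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Job`, `assign_task`, and `max_skips` are hypothetical, the wait is approximated by a count of declined scheduling opportunities rather than wall-clock time, and "fairness" is simplified to "fewest running tasks first".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Job:
    name: str
    # Each pending task is represented by the set of nodes holding its input data.
    pending: List[Set[str]] = field(default_factory=list)
    running: int = 0  # tasks launched so far (simplified fairness key)
    skips: int = 0    # scheduling opportunities declined while waiting for locality


def assign_task(jobs: List[Job], node: str, max_skips: int) -> Optional[Tuple[str, bool]]:
    """Pick a task for a free slot on `node`.

    Returns (job name, launched_locally), or None if no job launches a task.
    """
    # Assumption: the job with the fewest running tasks is "next" under fairness.
    for job in sorted(jobs, key=lambda j: j.running):
        if not job.pending:
            continue
        # Prefer a task whose input data lives on this node.
        local = next((t for t in job.pending if node in t), None)
        if local is not None:
            job.pending.remove(local)
            job.running += 1
            job.skips = 0
            return job.name, True
        if job.skips >= max_skips:
            # The job has waited long enough: launch a non-local task.
            job.pending.pop(0)
            job.running += 1
            return job.name, False
        # Otherwise the job waits, and the next job in fairness order gets a chance.
        job.skips += 1
    return None
```

For example, if job A is first in line but its data is on `n2`, a free slot on `n1` goes to job B (whose data is on `n1`), and A launches locally once a slot on `n2` frees up. Setting `max_skips` to 0 disables the waiting and reduces the sketch to plain fairness-ordered scheduling.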