Dynamic proportional share scheduling in Hadoop

  • Authors: Thomas Sandholm; Kevin Lai
  • Affiliation: Social Computing Lab, Hewlett-Packard Labs, Palo Alto, CA (both authors)
  • Venue: JSSPP'10: Proceedings of the 15th International Conference on Job Scheduling Strategies for Parallel Processing
  • Year: 2010

Abstract

We present the Dynamic Priority (DP) parallel task scheduler for Hadoop. It allows users to control their allocated capacity by adjusting their spending over time. This simple mechanism lets the scheduler make more efficient decisions about which jobs and users to prioritize, and it gives users a tool to optimize and customize their allocations to fit the importance and requirements of their jobs. It also gives users an incentive to scale back their jobs when demand is high, since running on a slot then costs more. We envision our scheduler being used by deadline-optimizing or budget-optimizing agents acting on behalf of users. We describe the design and implementation of the DP scheduler and report experimental results. We show that our scheduler enforces service levels more accurately than existing schedulers and scales to more users with distinct service levels.
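
The spending-based mechanism in the abstract can be illustrated with a minimal sketch. It is not the DP scheduler's implementation: it assumes that each user declares a spending rate, that slots are divided in proportion to those rates, and that the effective price of a slot is the total spending divided by the number of slots. The function names `proportional_shares` and `price_per_slot` and the example users are hypothetical.

```python
def proportional_shares(spending_rates, total_slots):
    """Divide total_slots among users in proportion to their spending rates.

    Assumption for illustration: a user's share is rate / sum(all rates).
    """
    total = sum(spending_rates.values())
    if total <= 0:
        # Nobody is spending, so nobody is allocated capacity.
        return {user: 0.0 for user in spending_rates}
    return {
        user: total_slots * rate / total
        for user, rate in spending_rates.items()
    }


def price_per_slot(spending_rates, total_slots):
    """Effective cost of one slot: total spending spread over all slots."""
    return sum(spending_rates.values()) / total_slots


# Two users compete for 8 slots.
low_demand = {"alice": 3.0, "bob": 1.0}
shares_low = proportional_shares(low_demand, 8)
price_low = price_per_slot(low_demand, 8)

# A third user joins: alice keeps spending 3.0 but now receives fewer slots,
# and each slot costs more.
high_demand = {"alice": 3.0, "bob": 1.0, "carol": 4.0}
shares_high = proportional_shares(high_demand, 8)
price_high = price_per_slot(high_demand, 8)
```

Under these assumptions, a user who keeps the same spending rate receives fewer slots as others start spending, and the price per slot rises with demand, which is the incentive to scale back that the abstract describes.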