Design and implementation of a customizable work stealing scheduler

  • Authors:
  • Jun Nakashima; Sho Nakatani; Kenjiro Taura

  • Affiliations:
  • The University of Tokyo; The University of Tokyo; The University of Tokyo

  • Venue:
  • Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
  • Year:
  • 2013

Abstract

An efficient scheduler is important for task parallelism. It should provide a scalable, dynamic load-balancing mechanism among CPU cores. To meet this requirement, most runtime systems for task parallelism use work stealing as their scheduling strategy. Work stealing schedulers typically choose the victim to steal from at random. This strategy considers neither hardware-specific knowledge, such as the memory hierarchy, nor application-specific knowledge, such as cache usage. To execute tasks more efficiently, work stealing schedulers should take such knowledge into account. To this end, we propose an API for customizing scheduling strategies that takes hardware- and application-specific knowledge into account while preserving the desirable properties of work stealing. This paper describes the design of the proposed API. Specifically, it provides mechanisms to attach scheduling hints to tasks and to implement user-defined work stealing functions. These mechanisms enable programmers to implement a work stealing strategy optimized for their applications. This paper also presents preliminary evaluation results for the proposed API. A kernel of the STREAM microbenchmark improved by 58.8% with a work stealing strategy that utilizes data cached by the previous iteration. The performance of matrix multiplication improved by 18.2% on 32 AMD cores with a work stealing strategy that tries to steal as coarse-grained a task as possible.
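
The two mechanisms named in the abstract, scheduling hints on tasks and user-defined work stealing functions, can be illustrated with a small single-threaded Python sketch. It is not the API proposed in the paper: the names `Task`, `hint`, `Worker`, `random_steal`, `coarsest_steal`, and `next_task` are hypothetical, and the hint is assumed to be an integer estimate of task size. The sketch only shows how a pluggable steal function could replace random victim selection with a policy that prefers coarse-grained tasks.

```python
import random
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Optional


@dataclass
class Task:
    name: str
    hint: int = 0  # hypothetical scheduling hint: estimated task size


@dataclass
class Worker:
    wid: int
    # The owner pushes and pops at the right end; thieves take from the left.
    tasks: Deque[Task] = field(default_factory=deque)


# A steal function receives the thief and all workers, and returns a task or None.
StealFn = Callable[[Worker, List[Worker]], Optional[Task]]


def random_steal(thief: Worker, workers: List[Worker]) -> Optional[Task]:
    """Baseline policy: pick a random non-empty victim."""
    victims = [w for w in workers if w is not thief and w.tasks]
    if not victims:
        return None
    return random.choice(victims).tasks.popleft()


def coarsest_steal(thief: Worker, workers: List[Worker]) -> Optional[Task]:
    """Hint-aware policy (an assumption for illustration): steal from the
    victim whose oldest task carries the largest size hint."""
    victims = [w for w in workers if w is not thief and w.tasks]
    if not victims:
        return None
    victim = max(victims, key=lambda w: w.tasks[0].hint)
    return victim.tasks.popleft()


def next_task(worker: Worker, workers: List[Worker], steal: StealFn) -> Optional[Task]:
    """Run local work first (LIFO); fall back to the supplied steal function."""
    if worker.tasks:
        return worker.tasks.pop()
    return steal(worker, workers)
```

For example, with worker 1 holding tasks hinted 4 and 1 and worker 2 holding tasks hinted 16 and 2, an idle worker 0 calling `next_task(..., coarsest_steal)` takes the task hinted 16 from worker 2, whereas `random_steal` may take either victim's oldest task. A real runtime would use concurrent deques and per-core threads; those details are omitted here.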