Work-stealing without the baggage

  • Authors:
  • Vivek Kumar;Daniel Frampton;Stephen M. Blackburn;David Grove;Olivier Tardieu

  • Affiliations:
  • Australian National University, Canberra, Australia;Microsoft, Seattle, WA, USA;Australian National University, Canberra, Australia;IBM T.J. Watson Research, New York, NY, USA;IBM T.J. Watson Research, New York, NY, USA

  • Venue:
  • Proceedings of the ACM international conference on Object oriented programming systems languages and applications
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Work-stealing is a promising approach for effectively exploiting software parallelism on parallel hardware. A programmer who uses work-stealing explicitly identifies potential parallelism and the runtime then schedules work, keeping otherwise idle hardware busy while relieving overloaded hardware of its burden. Prior work has demonstrated that work-stealing is very effective in practice. However, work-stealing comes with a substantial overhead: as much as 2x to 12x slowdown over orthodox sequential code. In this paper we identify the key sources of overhead in work-stealing schedulers and present two significant refinements to their implementation. We evaluate our work-stealing designs using a range of benchmarks, four different work-stealing implementations, including the popular fork-join framework, and a range of architectures. On these benchmarks, compared to orthodox sequential Java, our fastest design has an overhead of just 15%. By contrast, fork-join has a 2.3x overhead and the previous implementation of the system we use has an overhead of 4.1x. These results and our insight into the sources of overhead for work-stealing implementations give further hope to an already promising technique for exploiting increasingly available hardware parallelism.