Solving the straggler problem with bounded staleness

  • Authors:
  • James Cipar; Qirong Ho; Jin Kyu Kim; Seunghak Lee; Gregory R. Ganger; Garth Gibson; Kimberly Keeton; Eric Xing

  • Affiliations:
  • Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University; HP Labs; Carnegie Mellon University

  • Venue:
  • HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
  • Year:
  • 2013

Abstract

Many important applications fall into the broad class of iterative convergent algorithms. Parallel implementations of these algorithms are naturally expressed using the Bulk Synchronous Parallel (BSP) model of computation. However, BSP implementations are plagued by the straggler problem, in which a transient slowdown of any one thread can delay all other threads. This paper presents the Stale Synchronous Parallel (SSP) model, a generalization of BSP that preserves many of its advantages while avoiding the straggler problem. Algorithms using SSP can execute efficiently even when some threads experience significant delays.
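
The difference between BSP and a bounded-staleness model can be illustrated with a small Python sketch. It is not the authors' implementation. It assumes the staleness bound is a cap on how many iterations a thread may run ahead of the slowest thread, which the abstract does not spell out. The names StalenessClock, tick, staleness, and run are hypothetical and chosen only for this illustration.

```python
import threading
import time


class StalenessClock:
    """Hypothetical per-worker iteration clock with a bounded gap.

    Assumption: a worker may begin its next iteration only while it is at
    most `staleness` iterations ahead of the slowest worker.
    """

    def __init__(self, num_workers, staleness):
        self.clocks = [0] * num_workers  # completed iterations per worker
        self.staleness = staleness
        self.cond = threading.Condition()
        self.max_gap = 0  # largest lead observed when a worker was released

    def tick(self, worker_id):
        """Mark one iteration done, then block while too far ahead."""
        with self.cond:
            self.clocks[worker_id] += 1
            # Our progress may unblock workers waiting on the slowest clock.
            self.cond.notify_all()
            while self.clocks[worker_id] - min(self.clocks) > self.staleness:
                self.cond.wait()
            gap = self.clocks[worker_id] - min(self.clocks)
            self.max_gap = max(self.max_gap, gap)


def run(num_workers, staleness, iterations, straggler_delay=0.002):
    """Run workers where worker 0 is an artificial straggler."""
    clock = StalenessClock(num_workers, staleness)

    def worker(worker_id):
        for _ in range(iterations):
            if worker_id == 0:
                time.sleep(straggler_delay)  # simulated transient slowdown
            clock.tick(worker_id)

    threads = [
        threading.Thread(target=worker, args=(w,)) for w in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return clock


if __name__ == "__main__":
    result = run(num_workers=4, staleness=2, iterations=20)
    print("final clocks:", result.clocks)
    print("largest lead over slowest worker:", result.max_gap)
```

Under these assumptions, staleness=0 makes tick behave like a barrier after every iteration, which corresponds to BSP, while a larger value lets the faster workers keep computing through a short slowdown of worker 0 instead of stalling immediately.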