Managing data transfers in computer clusters with orchestra

  • Authors:
  • Mosharaf Chowdhury;Matei Zaharia;Justin Ma;Michael I. Jordan;Ion Stoica

  • Affiliations:
  • University of California, Berkeley, Berkeley, CA, USA;University of California, Berkeley, Berkeley, CA, USA;University of California, Berkeley, Berkeley, CA, USA;University of California, Berkeley, Berkeley, CA, USA;University of California, Berkeley, Berkeley, CA, USA

  • Venue:
  • Proceedings of the ACM SIGCOMM 2011 conference
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Cluster computing applications like MapReduce and Dryad transfer massive amounts of data between their computation stages. These transfers can have a significant impact on job performance, accounting for more than 50% of job completion times. Despite this impact, there has been relatively little work on optimizing the performance of these data transfers, with networking researchers traditionally focusing on per-flow traffic management. We address this limitation by proposing a global management architecture and a set of algorithms that (1) improve the transfer times of common communication patterns, such as broadcast and shuffle, and (2) allow scheduling policies at the transfer level, such as prioritizing a transfer over other transfers. Using a prototype implementation, we show that our solution improves broadcast completion times by up to 4.5X compared to the status quo in Hadoop. We also show that transfer-level scheduling can reduce the completion time of high-priority transfers by 1.7X.