Breaking the MapReduce stage barrier

  • Authors:
  • Abhishek Verma; Brian Cho; Nicolas Zea; Indranil Gupta; Roy H. Campbell

  • Affiliations:
  • Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, USA 61801-2302 (all authors)

  • Venue:
  • Cluster Computing
  • Year:
  • 2013

Abstract

The MapReduce model uses a barrier between the Map and Reduce stages. This provides simplicity in both programming and implementation. However, in many situations this barrier is overly restrictive and hurts performance. Hence, we develop a method to break the barrier in MapReduce in a way that improves efficiency. Careful design of our barrier-less MapReduce framework preserves the generality of the original model and retains its ease of programming. We motivate our case with a wide variety of MapReduce applications, divided into seven classes, and experimentally study our barrier-less techniques on them. Our experiments show that our approach can achieve better job completion times than a traditional MapReduce framework. This is due primarily to interleaving I/O with computation and forgoing disk-intensive work. Job completion times are reduced by 25% on average and by 87% in the best case.
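
To make the contrast concrete, here is a minimal single-process sketch in Python. It is an illustration only, not the authors' framework: the function names (map_words, run_with_barrier, run_without_barrier) are hypothetical, and word count is chosen as an assumed example of a job whose reduce step can be applied incrementally. The first runner buffers and sorts all map output before reducing; the second folds each record into a partial result as soon as it is emitted.

```python
from collections import defaultdict
from itertools import groupby
from typing import Dict, Iterable, Iterator, Tuple

Pair = Tuple[str, int]


def map_words(line: str) -> Iterator[Pair]:
    """Hypothetical map function: emit (word, 1) for each word in a line."""
    for word in line.split():
        yield (word.lower(), 1)


def run_with_barrier(lines: Iterable[str]) -> Dict[str, int]:
    """Traditional flow: no reduce work starts until all map output exists."""
    # Map stage: run every map call to completion and buffer the output.
    intermediate = [pair for line in lines for pair in map_words(line)]
    # Barrier: sort and group by key, which requires the full map output.
    intermediate.sort(key=lambda kv: kv[0])
    result: Dict[str, int] = {}
    for key, group in groupby(intermediate, key=lambda kv: kv[0]):
        # Reduce stage: each key is reduced once, with all its values present.
        result[key] = sum(value for _, value in group)
    return result


def run_without_barrier(lines: Iterable[str]) -> Dict[str, int]:
    """Barrier-less flow (sketch): reduce each record as it is emitted."""
    # Partial results are kept per key and updated incrementally.
    # Assumption: the reduce operation (here, addition) can be applied
    # one value at a time and does not depend on arrival order.
    partial: Dict[str, int] = defaultdict(int)
    for line in lines:
        for key, value in map_words(line):
            partial[key] += value  # no buffering, sorting, or grouping step
    return dict(partial)


if __name__ == "__main__":
    sample = ["the quick fox", "The lazy dog", "the fox"]
    print(run_with_barrier(sample))
    print(run_without_barrier(sample))
```

Both runners return the same counts for the sample input. The barrier-less version avoids the intermediate buffer and the sort, at the cost of holding partial results in memory; reduce functions that need all values for a key at once would require a different treatment than this sketch shows.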