MRBS: Towards Dependability Benchmarking for Hadoop MapReduce

  • Authors:
  • Amit Sangroya; Damián Serrano; Sara Bouchenak

  • Affiliations:
  • University of Grenoble - LIG - INRIA, Grenoble, France; University of Grenoble - LIG - INRIA, Grenoble, France; University of Grenoble - LIG - INRIA, Grenoble, France

  • Venue:
  • Euro-Par'12 Proceedings of the 18th international conference on Parallel processing workshops
  • Year:
  • 2012

Abstract

MapReduce is a popular programming model for distributed data processing. Extensive research has been conducted on the reliability of MapReduce, ranging from adaptive and on-demand fault tolerance to new fault-tolerance models. However, realistic benchmarks for analyzing and comparing the effectiveness of these proposals are still missing. To date, most MapReduce fault-tolerance solutions have been evaluated using microbenchmarks in ad hoc and overly simplified settings, which may not be representative of real-world applications. This paper presents MRBS, a comprehensive benchmark suite for evaluating the dependability of MapReduce systems. MRBS includes five benchmarks covering several application domains and a wide range of execution scenarios, such as data-intensive vs. compute-intensive applications, or batch applications vs. online interactive applications. MRBS supports injecting various types of faults at different rates and produces extensive reliability, availability, and performance statistics. The paper illustrates the use of MRBS with Hadoop clusters.