New ideas track: testing MapReduce-style programs

  • Authors:
  • Christoph Csallner; Leonidas Fegaras; Chengkai Li

  • Affiliations:
  • University of Texas at Arlington, Arlington, TX, USA; University of Texas at Arlington, Arlington, TX, USA; University of Texas at Arlington, Arlington, TX, USA

  • Venue:
  • Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
  • Year:
  • 2011

Abstract

MapReduce has become a common programming model for processing very large amounts of data, as needed in a wide spectrum of modern computing applications. Today several MapReduce implementations and execution systems exist, and many MapReduce programs are being developed and deployed in practice. However, developing MapReduce programs is not always easy. The programming model makes programs prone to several MapReduce-specific bugs: to produce deterministic results, a MapReduce program needs to satisfy certain high-level correctness conditions. A program that violates these conditions may yield different output values on the same input data, depending on low-level infrastructure events such as network latency and scheduling decisions. Current MapReduce systems and tools lack support for checking these conditions and reporting violations. This paper presents a novel technique that systematically searches for such bugs in MapReduce applications and generates corresponding test cases. The technique works by encoding the high-level MapReduce correctness conditions as symbolic program constraints and checking them for the program under test. To the best of our knowledge, this is the first approach to addressing this problem of MapReduce-style programming.
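
The abstract does not enumerate the correctness conditions, so the following is only an illustrative sketch under one assumption: that a reduce function's result should not depend on the order in which its values arrive. The function names (`reduce_sum`, `reduce_first_minus_rest`, `find_order_violation`) are hypothetical, and the check enumerates permutations by brute force. It is a stand-in for, not a reproduction of, the symbolic constraint encoding the paper describes.

```python
from itertools import permutations


def reduce_sum(values):
    """Order-insensitive reducer: integer addition is commutative and associative."""
    total = 0
    for v in values:
        total += v
    return total


def reduce_first_minus_rest(values):
    """Order-sensitive reducer: the result depends on which value arrives first."""
    result = values[0]
    for v in values[1:]:
        result -= v
    return result


def find_order_violation(reducer, values):
    """Return two orderings of `values` that give different reducer outputs.

    Returns None if every permutation produces the same output. This is a
    brute-force check (assumed here for illustration), feasible only for
    small inputs.
    """
    baseline_order = tuple(values)
    baseline = reducer(list(baseline_order))
    for perm in permutations(values):
        if reducer(list(perm)) != baseline:
            # The pair of orderings serves as a concrete failing test case.
            return baseline_order, perm
    return None


if __name__ == "__main__":
    print(find_order_violation(reduce_sum, [1, 2, 3]))
    print(find_order_violation(reduce_first_minus_rest, [1, 2, 3]))
```

A non-`None` result plays the role of a generated test case: two arrival orders of the same values that make the reducer disagree with itself, which is the kind of nondeterminism a scheduler or network delay could expose.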