Modeling the Performance of the Hadoop Online Prototype

  • Authors:
  • Emanuel Vianna; Giovanni Comarela; Tatiana Pontes; Jussara Almeida; Virgilio Almeida; Kevin Wilkinson; Harumi Kuno; Umeshwar Dayal

  • Venue:
  • SBAC-PAD '11 Proceedings of the 2011 23rd International Symposium on Computer Architecture and High Performance Computing
  • Year:
  • 2011

Abstract

MapReduce is an important paradigm for supporting modern data-intensive applications. In this paper, we address the challenge of modeling the performance of one MapReduce implementation, the Hadoop Online Prototype (HOP), focusing specifically on its intra-job pipeline parallelism. We use a hierarchical model that combines a precedence model and a queuing network model to capture the intra-job synchronization constraints. We first show how to build a precedence graph that represents the dependencies among the tasks of a single job. We then apply it jointly with an approximate Mean Value Analysis (aMVA) solution to predict mean job response time and resource utilization. We validate our solution against a queuing network simulator in various scenarios and find that the model's predictions agree closely with the simulation results, with a maximum relative difference under 15%.
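
To make the aMVA step more concrete, here is a minimal sketch of the textbook Bard-Schweitzer approximate MVA iteration for a single-class closed queuing network. It is not the paper's hierarchical model: it ignores the precedence graph and the intra-job synchronization constraints, and the function name `schweitzer_amva`, its parameters, and the example service demands are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def schweitzer_amva(
    demands: List[float],
    n_jobs: int,
    think_time: float = 0.0,
    tol: float = 1e-9,
    max_iter: int = 10000,
) -> Tuple[float, float, List[float]]:
    """Bard-Schweitzer approximate MVA for a single-class closed network.

    demands    : service demand per queueing station (hypothetical values)
    n_jobs     : number of circulating customers (tasks or jobs)
    think_time : delay outside the queueing stations
    Returns (mean response time, throughput, per-station utilization).
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    if not demands or any(d <= 0 for d in demands):
        raise ValueError("demands must be a non-empty list of positive values")

    k = len(demands)
    # Initial guess: customers spread evenly over the stations.
    queue = [n_jobs / k] * k

    for _ in range(max_iter):
        # Schweitzer approximation: an arriving customer sees the
        # steady-state queue length scaled by (N - 1) / N.
        residence = [
            d * (1.0 + (n_jobs - 1) / n_jobs * q)
            for d, q in zip(demands, queue)
        ]
        response = sum(residence)
        # Little's law applied to the whole network.
        throughput = n_jobs / (think_time + response)
        # Little's law applied to each station.
        new_queue = [throughput * r for r in residence]
        converged = max(abs(a - b) for a, b in zip(new_queue, queue)) < tol
        queue = new_queue
        if converged:
            break

    # Utilization law: U_k = X * D_k.
    utilization = [throughput * d for d in demands]
    return response, throughput, utilization


if __name__ == "__main__":
    # Hypothetical demands (seconds) for a CPU and a disk station.
    r, x, u = schweitzer_amva([0.05, 0.08], n_jobs=10, think_time=1.0)
    print(f"response={r:.4f}s throughput={x:.4f}/s utilization={u}")
```

In a hierarchical scheme such as the one the abstract describes, a solver of this kind would presumably be invoked repeatedly, with the precedence model supplying the number of concurrently active tasks; that coupling is not shown here.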