Cloud MapReduce: A MapReduce Implementation on Top of a Cloud Operating System

  • Authors:
  • Huan Liu;Dan Orban

  • Affiliations:
  • -;-

  • Venue:
  • CCGRID '11 Proceedings of the 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Like a traditional Operating System (OS), a cloud OS is responsible for managing the low level cloud resources and presenting a high level interface to the application programmers in order to hide the infrastructure details. However, unlike a traditional OS, a cloud OS has to manage these resources at scale. If a cloud OS has already taken on the complexity to make its services scalable, we should be able to greatly simplify a large-scale system design and implementation if we build on top of it. Unfortunately, a cloud's scale comes at a price. For example, Amazon cloud not only relies on horizontal scaling, but it also adopts a weaker consistency model called eventual consistency. We describe Cloud MapReduce (CMR), which implements the MapReduce programming model on top of the Amazon cloud OS. CMR is a demonstration that it is possible to overcome the cloud limitations and simplify system design and implementation by building on top of a cloud OS. We describe how we overcome the limitations presented by horizontal scaling and the weaker consistency guarantee. Our experimental results show that CMR runs faster than Hadoop, another implementation of MapReduce, and that CMR is a practical system. We believe that the techniques we used are general enough that they can be used to build other systems on top of a cloud OS.