MOON: MapReduce On Opportunistic eNvironments

  • Authors:
  • Heshan Lin;Xiaosong Ma;Jeremy Archuleta;Wu-chun Feng;Mark Gardner;Zhe Zhang

  • Affiliations:
  • Virginia Tech;North Carolina State U.;Virginia Tech;Virginia Tech;Virginia Tech;Oak Ridge Nat'l Lab

  • Venue:
  • Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

MapReduce offers an ease-of-use programming paradigm for processing large data sets, making it an attractive model for distributed volunteer computing systems. However, unlike on dedicated resources, where MapReduce has mostly been deployed, such volunteer computing systems have significantly higher rates of node unavailability. Furthermore, nodes are not fully controlled by the MapReduce framework. Consequently, we found the data and task replication scheme adopted by existing MapReduce implementations woefully inadequate for resources with high unavailability. To address this, we propose MOON, short for MapReduce On Opportunistic eNvironments. MOON extends Hadoop, an open-source implementation of MapReduce, with adaptive task and data scheduling algorithms in order to offer reliable MapReduce services on a hybrid resource architecture, where volunteer computing systems are supplemented by a small set of dedicated nodes. Our tests on an emulated volunteer computing system, which uses a 60-node cluster where each node possesses a similar hardware configuration to a typical computer in a student lab, demonstrate that MOON can deliver a three-fold performance improvement to Hadoop in volatile, volunteer computing environments.