MRSG - A MapReduce simulator over SimGrid

Authors:
Wagner Kolberg;Pedro De B. Marcos;Julio C. S. Anjos;Alexandre K. S. Miyazaki;Claudio R. Geyer;Luciana B. Arantes
Affiliations:
Federal University of Rio Grande do Sul (UFRGS), Institute of Informatics - GPPD, Caixa Postal 15.064 - 91.501-970, Porto Alegre, RS, Brazil;Federal University of Rio Grande do Sul (UFRGS), Institute of Informatics - GPPD, Caixa Postal 15.064 - 91.501-970, Porto Alegre, RS, Brazil;Federal University of Rio Grande do Sul (UFRGS), Institute of Informatics - GPPD, Caixa Postal 15.064 - 91.501-970, Porto Alegre, RS, Brazil;Federal University of Rio Grande do Sul (UFRGS), Institute of Informatics - GPPD, Caixa Postal 15.064 - 91.501-970, Porto Alegre, RS, Brazil;Federal University of Rio Grande do Sul (UFRGS), Institute of Informatics - GPPD, Caixa Postal 15.064 - 91.501-970, Porto Alegre, RS, Brazil;Université Pierre et Marie Curie, CNRS INRIA - REGAL, 4 Place Jussieu, 75005 Paris, France
Venue:
Parallel Computing
Year:
2013

Abstract

MapReduce is a parallel programming model for processing large datasets, inspired by the Map and Reduce primitives of functional languages. Its first implementation was designed to run on large clusters of homogeneous machines. In recent years, however, the model has been ported to other types of environments, such as desktop grids and volunteer computing. To obtain good performance in these environments, some framework mechanisms must be adapted, such as the scheduling and data distribution algorithms. In this paper we present the MRSG simulator, which reproduces the MapReduce workflow on top of the SimGrid simulation toolkit and provides an API to implement and evaluate such new algorithms and policies for MapReduce. To evaluate the simulator, we compared its behavior against a real Hadoop MapReduce deployment. The results show a significant similarity between the simulated and real executions.
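
As an illustration of the programming model the abstract refers to, the following is a minimal single-process sketch of a word count expressed as a map phase, a grouping step, and a reduce phase. It is not MRSG code and does not use the SimGrid or Hadoop APIs; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions chosen for this example only.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold all values of one key into a single count."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(documents):
    """Run the three steps sequentially over a list of input splits."""
    intermediate = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())


if __name__ == "__main__":
    # Hypothetical input: each string stands in for one data split.
    splits = ["map reduce map", "reduce simulate"]
    print(word_count(splits))
```

In a real deployment the map and reduce calls run on different machines, and the scheduling and data distribution decisions mentioned in the abstract determine which machine processes which split; this sketch runs everything in one process and only shows the data flow.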