SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Evaluating MapReduce for Multi-core and Multiprocessor Systems
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
SimGrid: A Generic Framework for Large-Scale Distributed Experiments
UKSIM '08 Proceedings of the Tenth International Conference on Computer Modeling and Simulation
Using realistic simulation for performance analysis of mapreduce setups
Proceedings of the 1st ACM workshop on Large-Scale system and application performance
Improving MapReduce performance in heterogeneous environments
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Towards MapReduce for Desktop Grid Computing
3PGCIC '10 Proceedings of the 2010 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing
MapReduce in the Clouds for Science
CLOUDCOM '10 Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science
Mars: Accelerating MapReduce with Graphics Processors
IEEE Transactions on Parallel and Distributed Systems
Volunteer Cloud Computing: MapReduce over the Internet
IPDPSW '11 Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum
Hadoop: The Definitive Guide
Hi-index | 0.00 |
MapReduce is a parallel programming model to process large datasets, and it was inspired by the Map and Reduce primitives from functional languages. Its first implementation was designed to run on large clusters of homogeneous machines. Though, in the last years, the model was ported to different types of environments, such as desktop grid and volunteer computing. To obtain a good performance in these environments, however, it is necessary to adapt some framework mechanisms, such as scheduling and data distribution algorithms. In this paper we present the MRSG simulator, which reproduces the MapReduce work-flow on top of the SimGrid simulation toolkit, and provides an API to implement and evaluate these new algorithms and policies for MapReduce. To evaluate the simulator, we compared its behavior against a real Hadoop MapReduce deployment. The results show an important similarity between the simulated and real executions.