MapReduce is an emerging programming model for data-intensive applications, proposed by Google, that has recently attracted a lot of attention. MapReduce borrows from functional programming: the programmer defines Map and Reduce tasks that are executed on large sets of distributed data. In this paper we propose an implementation of the MapReduce programming model. We present the architecture of the prototype, which is based on BitDew, a middleware for large-scale data management on Desktop Grids. We describe the set of features that makes our approach suitable for large-scale and loosely connected Internet Desktop Grids: massive fault tolerance, replica management, barrier-free execution, latency-hiding optimisation, and distributed result checking. We also present a performance evaluation of the prototype against both micro-benchmarks and a real MapReduce application. The scalability test shows that we achieve linear speedup on the classical WordCount benchmark. Several scenarios involving lagging hosts and host crashes demonstrate that the prototype is able to cope with an experimental context similar to the real-world Internet.
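As an illustration of the programming model only, the WordCount benchmark mentioned above can be sketched as a single-process Python program. This is a minimal sketch and not the BitDew-based prototype: the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are hypothetical, and the sketch omits the distribution, fault tolerance, replica management, and result checking that the abstract describes.

```python
from collections import defaultdict
from functools import reduce


def map_words(document):
    """Map task (hypothetical name): emit a (word, 1) pair per word in one input chunk."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, standing in for the runtime's shuffle phase."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce task (hypothetical name): sum all counts emitted for one word."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially in a single process."""
    intermediate = [pair for doc in documents for pair in map_words(doc)]
    return dict(reduce_counts(k, v) for k, v in shuffle(intermediate).items())
```

Calling `word_count(["a b a", "b c"])` counts each word across both input chunks. In a distributed setting, each call to `map_words` and `reduce_counts` would run as an independent task on a different host, which is what makes the model amenable to replication and re-execution after host failures.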