Does RDMA-based enhanced Hadoop MapReduce need a new performance model?

Authors:
Md. Wasi-ur-Rahman;Xiaoyi Lu;Nusrat S. Islam;Dhabaleswar K. (DK) Panda
Affiliations:
The Ohio State University;The Ohio State University;The Ohio State University;The Ohio State University
Venue:
Proceedings of the 4th annual Symposium on Cloud Computing
Year:
2013

References:
MapReduce: simplified data processing on large clusters. Communications of the ACM - 50th anniversary issue: 1958 - 2008
ParaTimer: a progress indicator for MapReduce DAGs. Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
MapReduce online. NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
A model of computation for MapReduce. SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
DOT: a matrix model for analyzing, optimizing and deploying software for big data analytics in distributed systems. Proceedings of the 2nd ACM Symposium on Cloud Computing
Hadoop acceleration through network levitated merge. Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
TransMR: data-centric programming beyond data parallelism. HotCloud'11 Proceedings of the 3rd USENIX conference on Hot topics in cloud computing
High performance RDMA-based design of HDFS over InfiniBand. SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
A Practical Performance Model for Hadoop MapReduce. CLUSTERW '12 Proceedings of the 2012 IEEE International Conference on Cluster Computing Workshops

Abstract

Recent studies [17, 12] show that, by leveraging the benefits of high-performance interconnects such as InfiniBand, MapReduce job execution time can be greatly reduced through additional features such as in-memory merge, pipelined merge and reduce, and prefetching and caching of map outputs. In this paper, we validate that the RDMA-based design of MapReduce over high-performance interconnects needs a new performance model. Initial results derived from the proposed analytical model match the experimental results within a 3-5% range.