MR-scope: a real-time tracing tool for MapReduce

  • Authors:
  • Dachuan Huang;Xuanhua Shi;Shadi Ibrahim;Lu Lu;Hongzhang Liu;Song Wu;Hai Jin

  • Affiliations:
  • Huazhong University of Science and Technology, Wuhan, China;Huazhong University of Science and Technology, Wuhan, China;Huazhong University of Science and Technology, Wuhan, China;Huazhong University of Science and Technology, Wuhan, China;Huazhong University of Science and Technology, Wuhan, China;Huazhong University of Science and Technology, Wuhan, China;Huazhong University of Science and Technology, Wuhan, China

  • Venue:
  • Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

MapReduce programming model is emerging as an efficient tool for data-intensive applications. Hadoop, an open-source implementation of MapReduce, has been widely adopted and experienced by both academia and enterprise. Recently, lots of efforts have been done on improving the performance of MapReduce system and on analyzing the MapReduce process based on the log files generated during the Hadoop execution. Visualizing log files seems to be a very useful tool to understand the behavior of the Hadoop process. In this paper, we present MR-Scope, a real-time MapReduce tracing tool. MR-Scope provides a real-time insight of the MapReduce process, including the ongoing progress of every task hosted in Task Tracker. In addition, it displays the health of the Hadoop cluster data nodes, the distribution of the file system's blocks and their replicas and the content of the different block splits of the file system. We implement MR-Scope in native Hadoop 0.1. Experimental results demonstrat that MR-Scope's overhead is less than 4% when running wordcount benchmark.