HiTune: dataflow-based performance analysis for big data cloud

  • Authors:
  • Jinquan Dai; Jie Huang; Shengsheng Huang; Bo Huang; Yan Liu

  • Affiliations:
  • Intel Asia-Pacific Research and Development Ltd, Shanghai, P.R. China (all authors)

  • Venue:
  • HotCloud'11: Proceedings of the 3rd USENIX Conference on Hot Topics in Cloud Computing
  • Year:
  • 2011

Abstract

Although Big Data Cloud platforms (e.g., MapReduce, Hadoop and Dryad) make it easy to develop and run highly scalable applications, efficient provisioning and fine-tuning of these massively distributed systems remain a major challenge. In this paper, we describe a general approach to help address this challenge, based on distributed instrumentation and dataflow-driven performance analysis. Using this approach, we have implemented HiTune, a scalable, lightweight and extensible performance analyzer for Hadoop. We report our experience of how HiTune helps users conduct Hadoop performance analysis and tuning efficiently, demonstrating the benefits of dataflow-based analysis and the limitations of existing approaches (e.g., system statistics, Hadoop logs and metrics, and traditional profiling).
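
As a rough illustration of what dataflow-driven analysis over distributed instrumentation can mean, the sketch below groups sampling records collected on several nodes by the dataflow stage that was running when each sample was taken, then summarizes each stage. This is not HiTune's implementation: the `Sample` record, its fields, the stage names and the `summarize_by_stage` helper are all hypothetical, chosen only to make the idea concrete.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One instrumentation record (hypothetical format, not HiTune's)."""
    node: str         # host that produced the sample
    stage: str        # dataflow stage active when the sample was taken
    timestamp: float  # seconds since job start
    cpu_util: float   # CPU utilization as a fraction in [0.0, 1.0]


def summarize_by_stage(samples):
    """Group samples by dataflow stage and summarize each stage."""
    grouped = defaultdict(list)
    for sample in samples:
        grouped[sample.stage].append(sample)

    summary = {}
    for stage, items in grouped.items():
        summary[stage] = {
            "start": min(s.timestamp for s in items),
            "end": max(s.timestamp for s in items),
            "mean_cpu": sum(s.cpu_util for s in items) / len(items),
            "nodes": sorted({s.node for s in items}),
        }
    return summary


# Made-up samples from two nodes, for illustration only.
SAMPLES = [
    Sample("node-a", "map", 0.0, 0.5),
    Sample("node-b", "map", 4.0, 1.0),
    Sample("node-a", "shuffle", 5.0, 0.25),
    Sample("node-b", "shuffle", 9.0, 0.25),
]

if __name__ == "__main__":
    for stage, stats in summarize_by_stage(SAMPLES).items():
        print(stage, stats)
```

Keying the summary on the dataflow stage rather than on the node or process is the point of the sketch: per-node system statistics alone would show low CPU utilization between seconds 5 and 9 without indicating which stage of the job was responsible.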