Modeling I/O interference for data intensive distributed applications

  • Authors:
  • Sven Groot; Kazuo Goda; Daisaku Yokoyama; Miyuki Nakano; Masaru Kitsuregawa

  • Affiliations:
  • The University of Tokyo, Komaba, Meguro-ku, Tokyo, Japan (all authors)

  • Venue:
  • Proceedings of the 28th Annual ACM Symposium on Applied Computing
  • Year:
  • 2013

Abstract

Data-intensive applications such as MapReduce can suffer severe performance degradation from I/O interference when multiple processes access the same I/O resources simultaneously, particularly disks. Understanding this effect is necessary to improve resource allocation and utilization for these applications. In this paper, we propose a model for predicting the impact of I/O interference on MapReduce application performance. The model takes basic parameters of the workload and hardware environment, together with knowledge of the application's I/O behavior, and predicts how I/O interference affects the application's scalability. We compare the model's predictions for several workloads (TeraSort, WordCount, PFP Growth and PageRank) against the actual behavior of those workloads in a real cluster environment, and confirm that our model can provide highly accurate predictions.