Otus: resource attribution in data-intensive clusters

  • Authors:
  • Kai Ren; Julio López; Garth Gibson

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA, USA; Carnegie Mellon University, Pittsburgh, PA, USA; Carnegie Mellon University, Pittsburgh, PA, USA

  • Venue:
  • Proceedings of the second international workshop on MapReduce and its applications
  • Year:
  • 2011

Abstract

Frameworks for large-scale data-intensive applications, such as Hadoop and Dryad, have gained tremendous popularity. Understanding the resource requirements of these frameworks and the performance characteristics of distributed applications is inherently difficult. We present an approach, based on resource attribution, that aims at facilitating performance analyses of distributed data-intensive applications. This approach is embodied in Otus, a monitoring tool to attribute resource usage to jobs and services in Hadoop clusters. Otus collects and correlates performance metrics from distributed components and provides views that display time-series of these metrics filtered and aggregated using multiple criteria. Our evaluation shows that this approach can be deployed without incurring major overheads. Our experience with Otus in a production cluster suggests its effectiveness at helping users and cluster administrators with application performance analysis and troubleshooting.