Understanding and improving the diagnostic workflow of MapReduce users

  • Authors:
  • Jason D. Campbell;Arun B. Ganesan;Ben Gotow;Soila P. Kavulya;James Mulholland;Priya Narasimhan;Sriram Ramasubramanian;Mark Shuster;Jiaqi Tan

  • Affiliations:
  • Intel Labs Pittsburgh;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;DSO National Laboratories, Singapore

  • Venue:
  • CHIMIT '11 Proceedings of the 5th ACM Symposium on Computer Human Interaction for Management of Information Technology
  • Year:
  • 2011

Abstract

New abstractions are simplifying the programming of large clusters, but diagnosis nonetheless becomes more challenging as cluster sizes grow: debugging information increases linearly with cluster size, and the number of intercomponent relationships grows quadratically. Worse, the new abstractions that simplify programming can also obscure the relationships between high-level (application) and low-level (task/process/disk/CPU) information flows. In this paper we analyze the workflow of several users and systems administrators connected with a large academic cluster (based on the popular Hadoop implementation of the MapReduce abstraction) and propose improvements to the diagnosis-relevant information displays. We also offer a preliminary analysis of the efficacy of the proposed changes, which demonstrates a 40% reduction in the time taken to accomplish 5 representative diagnostic tasks compared to the current system.