Assisting failure diagnosis through filesystem instrumentation

  • Authors:
  • Liang Huang; Kenny Wong

  • Affiliations:
  • University of Alberta, Edmonton, Canada; University of Alberta, Edmonton, Canada

  • Venue:
  • Proceedings of the 2011 Conference of the Center for Advanced Studies on Collaborative Research
  • Year:
  • 2011

Abstract

As software grows in size and complexity, corrective maintenance becomes increasingly challenging. When a failure is reported, human operators need time and expertise to collect the right information and pinpoint the root cause. Operators are typically overloaded with information generated by many system components and need assistance. In practice, however, failures often recur. If a recurring failure can be identified accurately, the appropriate fix may already be known from previously collected experience with the system. Our approach to diagnosing failures examines how the state of the filesystem, and the way files are accessed, differ between normal and abnormal situations. We monitor the behavior of the system through its file-related calls on an instrumented filesystem. When a failure occurs, these calls are abstracted and classified to identify the likely cause. We implemented a diagnostic tool based on this approach. Through an experiment involving one J2EE Web application, we report the effectiveness of the approach in terms of precision and recall.
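
The abstract does not say how calls are abstracted or which classifier is used, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that abstraction means masking digit runs in paths, that classification means choosing the known failure whose call signature has the highest Jaccard similarity, and that every operation name, path, and failure label shown is hypothetical rather than taken from the paper.

```python
import re


def abstract_call(op, path):
    """Generalize one file call by masking digit runs (assumed abstraction rule)."""
    return (op, re.sub(r"\d+", "<n>", path))


def signature(calls):
    """Collapse a trace of (operation, path) pairs into a set of abstracted calls."""
    return frozenset(abstract_call(op, path) for op, path in calls)


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(calls, known):
    """Return the label of the known failure signature most similar to the trace."""
    sig = signature(calls)
    return max(known, key=lambda label: jaccard(sig, known[label]))


def precision_recall(predicted, actual, label):
    """Precision and recall of the predictions for a single failure label."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == label and a == label for p, a in pairs)
    fp = sum(p == label and a != label for p, a in pairs)
    fn = sum(p != label and a == label for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical signatures recorded from previously diagnosed failures.
known = {
    "missing_config": signature([
        ("open_fail", "/app/conf/db.properties"),
        ("read", "/app/logs/server.log"),
    ]),
    "disk_full": signature([
        ("write_fail", "/app/logs/server.log"),
        ("write_fail", "/tmp/upload_17.tmp"),
    ]),
}

# Hypothetical trace captured when a new failure is reported.
observed = [
    ("write_fail", "/tmp/upload_42.tmp"),
    ("write_fail", "/app/logs/server.log"),
    ("read", "/app/conf/db.properties"),
]

print(classify(observed, known))
```

Masking digits lets `/tmp/upload_42.tmp` match the earlier `/tmp/upload_17.tmp`, which is the kind of generalization a recurring failure needs in order to be recognized across runs. A real tool would likely need a richer abstraction and a classifier trained on many labelled traces.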