Context-based online configuration-error detection

  • Authors:
  • Ding Yuan;Yinglian Xie;Rina Panigrahy;Junfeng Yang;Chad Verbowski;Arunvijay Kumar

  • Affiliations:
  • University of Illinois at Urbana-Champaign and University of California, San Diego;Microsoft Research Silicon Valley;Microsoft Research Silicon Valley;Columbia University;Microsoft Corporation;Microsoft Corporation

  • Venue:
  • USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Software failures due to configuration errors are commonplace as computer systems continue to grow larger and more complex. Troubleshooting these configuration errors is a major administration cost, especially in server clusters where problems often go undetected without user interference. This paper presents CODE-a tool that automatically detects software configuration errors. Our approach is based on identifying invariant configuration access rules that predict what access events follow what contexts. It requires no source code, application-specific semantics, or heavyweight program analysis. Using these rules, CODE can sift through a voluminous number of events and detect deviant program executions. This is in contrast to previous approaches that focus on only diagnosis. In our experiments, CODE successfully detected a real configuration error in one of our deployment machines, in addition to 20 user-reported errors that we reproduced in our test environment. When analyzing month-long event logs from both user desktops and production servers, CODE yielded a low false positive rate. The efficiency of CODE makes it feasible to be deployed as a practical management tool with low overhead.