Empirical comparison of techniques for automated failure diagnosis

Authors:
Songyun Duan; Shivnath Babu
Affiliations:
Department of Computer Science, Duke University; Department of Computer Science, Duke University
Venue:
SysML'08 Proceedings of the Third conference on Tackling computer systems problems with machine learning techniques
Year:
2008



Abstract

Automated diagnosis of the cause of system failures based on monitoring data is an active area of research at the intersection of systems and machine learning. In this paper, we identify three tasks that form key building blocks in automated diagnosis: (1) identifying distinct states of the system using monitoring data; (2) retrieving monitoring data from past system states that are similar to the current state; and (3) pinpointing attributes in the monitoring data that indicate the likely cause of a system failure. We provide (to our knowledge) the first apples-to-apples comparison of both classical and state-of-the-art techniques for these three tasks. Such studies are vital to the consolidation and growth of the field. Our study is based on a variety of failures injected into a multitier Web service. We present empirical insights and research opportunities.