Automated diagnosis of the cause of system failures from monitoring data is an active area of research at the intersection of systems and machine learning. In this paper, we identify three tasks that form key building blocks in automated diagnosis: (1) identifying distinct states of the system using monitoring data; (2) retrieving monitoring data from past system states that are similar to the current state; and (3) pinpointing attributes in the monitoring data that indicate the likely cause of a system failure. We provide (to our knowledge) the first apples-to-apples comparison of both classical and state-of-the-art techniques for these three tasks. Such studies are vital to the consolidation and growth of the field. Our study is based on a variety of failures injected into a multitier Web service. We present empirical insights and research opportunities.