Quickly Finding Known Software Problems via Automated Symptom Matching

Authors:
Guy Lohman;Jon Champlin;Peter Sohn
Affiliations:
IBM Almaden Research Center;Lotus Development Lab;Lotus Development Lab
Venue:
ICAC '05 Proceedings of the Second International Conference on Autonomic Computing
Year:
2005

Citing 0
Cited 20

Research challenges of autonomic computing
Proceedings of the 27th international conference on Software engineering

Improved error reporting for software that uses black-box components
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation

A survey of autonomic computing—degrees, models, and applications
ACM Computing Surveys (CSUR)

An Architecture for Supporting Network Fault Recovery Management
AIMS '08 Proceedings of the 2nd international conference on Autonomous Infrastructure, Management and Security: Resilient Networks and Services

Isolation points: Creating performance-robust enterprise systems
ACM Transactions on Autonomous and Adaptive Systems (TAAS)

Understanding customer problem troubleshooting from storage system logs
FAST '09 Proceedings of the 7th conference on File and storage technologies

Elicitation and utilization of application-level utility functions
ICAC '09 Proceedings of the 6th international conference on Autonomic computing

Achieving Self-Healing in Autonomic Software Systems: a Case-Based Reasoning Approach
Proceedings of the 2005 conference on Self-Organization and Autonomic Informatics (I)

On the use of computational geometry to detect software faults at runtime
Proceedings of the 7th international conference on Autonomic computing

Finding similar failures using callstack similarity
SysML'08 Proceedings of the Third conference on Tackling computer systems problems with machine learning techniques

Empirical comparison of techniques for automated failure diagnosis
SysML'08 Proceedings of the Third conference on Tackling computer systems problems with machine learning techniques

F007: finding rediscovered faults from the field using function-level failed traces of software in the field
Proceedings of the 2010 Conference of the Center for Advanced Studies on Collaborative Research

Diagnosing new faults using mutants and prior faults (NIER track)
Proceedings of the 33rd International Conference on Software Engineering

Synthesis of application-level utility functions for autonomic self-assessment
Cluster Computing

I-queue: smart queues for service management
ICSOC'06 Proceedings of the 4th international conference on Service-Oriented Computing

Case-based reasoning for autonomous service failure diagnosis and remediation in software systems
ECCBR'06 Proceedings of the 8th European conference on Advances in Case-Based Reasoning

ReBucket: a method for clustering duplicate crash reports based on call stack similarity
Proceedings of the 34th International Conference on Software Engineering

Predicting recurring crash stacks
Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering

Performance troubleshooting in data centers: an annotated bibliography?
ACM SIGOPS Operating Systems Review

An empirical study on the use of mutant traces for diagnosis of faults in deployed systems
Journal of Systems and Software

Abstract

We present an architecture for and prototype of a system for quickly detecting software problem recurrences. Re-discovery of the same problem is very common in many large software products and is a major cost component of product support. At run-time, when a problem occurs, the system collects the problem symptoms, including the program call-stack, and compares them against a database of symptoms to find the closest matches. The database is populated off-line using solved cases and indexed to allow for efficient matching. Thus, problems that occur repeatedly can be easily and automatically resolved without requiring any human problem-solving expertise. We describe a prototype implementation of the system, including the matching algorithm, and present some experimental results demonstrating the value of automatically detecting re-occurrence of the same problem for a popular software product.
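
The abstract does not describe the matching algorithm itself, so the following is only a minimal sketch of the general idea: compare a newly observed call stack against stored stacks from solved cases and return the closest ones. Everything in it is an assumption made for illustration: the function names, the sample stacks and case labels, the use of Python's `difflib.SequenceMatcher` as the similarity measure, and the 0.5 threshold. It also scans the database linearly, whereas the system described above indexes its database for efficient matching.

```python
from difflib import SequenceMatcher

# Hypothetical database of solved cases (illustrative data only):
# each entry pairs a call stack, innermost frame first, with a known resolution.
KNOWN_PROBLEMS = [
    (["memcpy", "copy_buffer", "save_document", "main"],
     "CASE-A: apply buffer-size patch"),
    (["lock_acquire", "open_mailbox", "sync_folder", "main"],
     "CASE-B: upgrade mail sync module"),
]


def stack_similarity(a, b):
    """Similarity in [0, 1] between two call stacks (lists of frame names).

    Assumption: SequenceMatcher's ratio stands in for whatever measure
    the actual system uses.
    """
    return SequenceMatcher(None, a, b).ratio()


def closest_matches(stack, database, top_k=1, threshold=0.5):
    """Return up to top_k (score, resolution) pairs, best match first."""
    scored = [(stack_similarity(stack, known), fix) for known, fix in database]
    # Discard weak matches so that a genuinely new problem is not mislabeled.
    scored = [pair for pair in scored if pair[0] >= threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# A failure whose stack differs from the first stored case in one frame.
observed = ["memcpy", "copy_buffer", "export_document", "main"]
matches = closest_matches(observed, KNOWN_PROBLEMS)
```

In this sketch, the observed stack shares three of its four frames with the first stored case, so that case is returned, while the second case falls below the threshold. A stack with no sufficiently similar stored case yields an empty list, which would correspond to a problem that has not been seen before.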