We present an architecture for, and a prototype of, a system that quickly detects recurrences of software problems. Rediscovery of the same problem is common in many large software products and is a major component of product support cost. At run time, when a problem occurs, the system collects the problem symptoms, including the program call stack, and compares them against a database of symptoms to find the closest matches. The database is populated offline from solved cases and indexed for efficient matching. Problems that occur repeatedly can thus be resolved easily and automatically, without requiring human problem-solving expertise. We describe a prototype implementation of the system, including the matching algorithm, and present experimental results demonstrating the value of automatically detecting recurrences of the same problem in a popular software product.
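The matching step described above can be illustrated with a minimal sketch. The abstract does not give the paper's matching algorithm or index structure, so everything here is an assumption made for illustration: the `SymptomDatabase` class, the frame-name index, the use of Python's `difflib.SequenceMatcher` as the similarity measure, the `min_score` threshold, and the example stacks and case identifiers are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


class SymptomDatabase:
    """Hypothetical store of solved cases, indexed by call-stack frame names."""

    def __init__(self):
        self.cases = []                 # (case_id, stack, resolution) tuples
        self.index = defaultdict(set)   # frame name -> positions in self.cases

    def add_case(self, case_id, stack, resolution):
        """Offline step: record a solved case and index each of its frames."""
        pos = len(self.cases)
        self.cases.append((case_id, tuple(stack), resolution))
        for frame in set(stack):
            self.index[frame].add(pos)

    def closest_matches(self, stack, top_k=3, min_score=0.5):
        """Run-time step: rank stored cases by similarity to an observed stack."""
        # Use the index so that only cases sharing at least one frame are scored.
        candidates = set()
        for frame in set(stack):
            candidates |= self.index.get(frame, set())

        scored = []
        for pos in candidates:
            case_id, known_stack, resolution = self.cases[pos]
            # Assumed similarity measure: ratio of matching frame subsequences.
            matcher = SequenceMatcher(None, list(stack), list(known_stack),
                                      autojunk=False)
            score = matcher.ratio()
            if score >= min_score:
                scored.append((score, case_id, resolution))

        # Highest score first; ties broken by case identifier.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return scored[:top_k]


db = SymptomDatabase()
db.add_case("CASE-1",
            ["main", "run_query", "open_cursor", "alloc_page", "raise_error"],
            "apply fix A")
db.add_case("CASE-2",
            ["main", "load_config", "parse_file", "raise_error"],
            "correct the configuration file")

observed = ["main", "run_query", "open_cursor", "alloc_page",
            "log_error", "raise_error"]
for score, case_id, resolution in db.closest_matches(observed):
    print(f"{case_id}: score={score:.2f}, resolution={resolution}")
```

With these made-up stacks, only CASE-1 clears the threshold, since the observed stack differs from it by a single extra frame. A production system would likely need a similarity measure tuned to call stacks, for example one that weights frames near the point of failure more heavily, but that choice is outside what the abstract states.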