Automatically Identifying Known Software Problems

Authors:
Natwar Modani;Rajeev Gupta;Guy Lohman;Tanveer Syeda-Mahmood;Laurent Mignet
Affiliations:
IBM India Research Lab, Block-1, IIT Delhi Campus, New Delhi, India. namodani@in.ibm.com;IBM India Research Lab, Block-1, IIT Delhi Campus, New Delhi, India. grajeev@in.ibm.com;IBM Almaden Research Center, San Jose, CA, USA. lohman@almaden.ibm.com;IBM Almaden Research Center, San Jose, CA, USA. stf@almaden.ibm.com;IBM India Research Lab, Block-1, IIT Delhi Campus, New Delhi, India. lamignet@in.ibm.com
Venue:
ICDEW '07 Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering Workshop
Year:
2007

Citing 0
Cited 3

Finding similar failures using callstack similarity
SysML'08 Proceedings of the Third conference on Tackling computer systems problems with machine learning techniques

ReBucket: a method for clustering duplicate crash reports based on call stack similarity
Proceedings of the 34th International Conference on Software Engineering

Predicting recurring crash stacks
Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering

Abstract

Re-occurrence of the same problem is very common in many large software products. By matching the symptoms of a new problem to those in a database of known problems, automated diagnosis and even self-healing for re-occurrences can be (partially) realized. This paper exploits function call stacks as highly structured symptoms of a certain class of problems, including crashes, hangs, and traps. We propose and evaluate algorithms for efficiently and accurately matching call stacks by a weighted metric of the similarity of their function names, after first removing redundant recursion and uninformative (poorly discriminating) functions from those stacks. We also describe a new indexing scheme that speeds up queries against the repository of known problems without compromising the quality of the matches returned. Experiments conducted using call stacks from actual product problem reports demonstrate the improved accuracy (both precision and recall) resulting from our new stack-matching algorithms and from the removal of uninformative or redundant function names, as well as the performance and scalability improvements realized by indexing call stacks. We also discuss how call-stack matching can be used in both self-managing (or autonomic) systems and human "help desk" applications.
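
The pipeline outlined in the abstract (prune the stacks, score them with a weighted similarity of function names, and use an index to limit the comparisons) can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. The sketch assumes IDF-style weights, a weighted multiset Jaccard score that ignores frame order, and a plain inverted index. The class `StackRepository`, the threshold `min_weight`, and all function names in the sample stacks are hypothetical.

```python
import math
from collections import Counter

def collapse_recursion(stack):
    """Drop consecutive repeats of a function name (direct recursion only)."""
    out = []
    for fn in stack:
        if not out or out[-1] != fn:
            out.append(fn)
    return out

class StackRepository:
    """Hypothetical repository of known-problem call stacks (illustration only)."""

    def __init__(self, known, min_weight=0.1):
        n = len(known)
        # Document frequency: in how many known stacks each function appears.
        df = Counter(fn for s in known.values() for fn in set(s))
        # Assumed weighting: functions present in every stack get weight 0.
        self.weights = {fn: math.log(n / c) for fn, c in df.items()}
        self.unseen_weight = math.log(n) if n > 1 else 1.0
        self.min_weight = min_weight
        self.stacks = {pid: self.prune(s) for pid, s in known.items()}
        # Inverted index: function name -> ids of stacks containing it.
        self.index = {}
        for pid, fns in self.stacks.items():
            for fn in fns:
                self.index.setdefault(fn, set()).add(pid)

    def weight(self, fn):
        return self.weights.get(fn, self.unseen_weight)

    def prune(self, stack):
        """Collapse recursion, then drop poorly discriminating functions."""
        return [fn for fn in collapse_recursion(stack)
                if self.weight(fn) >= self.min_weight]

    def similarity(self, a, b):
        """Weighted multiset Jaccard over function names (order is ignored)."""
        ca, cb = Counter(a), Counter(b)
        inter = sum(self.weight(fn) * c for fn, c in (ca & cb).items())
        union = sum(self.weight(fn) * c for fn, c in (ca | cb).items())
        return inter / union if union else 0.0

    def match(self, stack):
        """Score only the stacks that share at least one kept function."""
        query = self.prune(stack)
        candidates = set()
        for fn in query:
            candidates |= self.index.get(fn, set())
        scored = [(pid, self.similarity(query, self.stacks[pid]))
                  for pid in candidates]
        return sorted(scored, key=lambda item: -item[1])

# Made-up stacks, outermost frame first.
known = {
    "P1": ["main", "run_query", "sort_rows", "sort_rows", "sort_rows", "compare_keys"],
    "P2": ["main", "run_query", "open_cursor", "lock_table"],
    "P3": ["main", "load_config", "parse_file"],
}
repo = StackRepository(known)
new_stack = ["main", "run_query", "sort_rows", "sort_rows", "compare_keys"]
results = repo.match(new_stack)
for problem_id, score in results:
    print(problem_id, round(score, 3))
```

In this toy data, `main` appears in every known stack and is pruned as uninformative, the differing recursion depth of `sort_rows` no longer affects the score once repeats are collapsed, and `P3` is never scored because the index finds no shared function.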