Fast and effective text mining using linear-time document clustering
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Introduction to algorithms
Data Mining: Concepts and Techniques
Data Mining: Concepts and Techniques
Quickly Finding Known Software Problems via Automated Symptom Matching
ICAC '05 Proceedings of the Second International Conference on Automatic Computing
Failure proximity: a fault localization-based approach
Proceedings of the 14th ACM SIGSOFT international symposium on Foundations of software engineering
Windows XP kernel crash analysis
LISA '06 Proceedings of the 20th conference on Large Installation System Administration
An approach to detecting duplicate bug reports using natural language and execution information
Proceedings of the 30th international conference on Software engineering
ReCrash: Making Software Failures Reproducible by Preserving Object States
ECOOP '08 Proceedings of the 22nd European conference on Object-Oriented Programming
Automatically Identifying Known Software Problems
ICDEW '07 Proceedings of the 2007 IEEE 23rd International Conference on Data Engineering Workshop
HOLMES: Effective statistical debugging via efficient path profiling
ICSE '09 Proceedings of the 31st International Conference on Software Engineering
A comparison of extrinsic clustering evaluation metrics based on formal constraints
Information Retrieval
DebugAdvisor: a recommender system for debugging
Proceedings of the the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Debugging in the (very) large: ten years of implementation and experience
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Finding similar failures using callstack similarity
SysML'08 Proceedings of the Third conference on Tackling computer systems problems with machine learning techniques
IEEE Transactions on Software Engineering
Crash graphs: An aggregated view of multiple crashes to improve crash triage
DSN '11 Proceedings of the 2011 IEEE/IFIP 41st International Conference on Dependable Systems&Networks
Software analytics in practice: mini tutorial
Proceedings of the 34th International Conference on Software Engineering
Predicting recurring crash stacks
Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering
Griffin: grouping suspicious memory-access patterns to improve understanding of concurrency bugs
Proceedings of the 2013 International Symposium on Software Testing and Analysis
Fault comprehension for concurrent programs
Proceedings of the 2013 International Conference on Software Engineering
Hi-index | 0.00 |
Software often crashes. Once a crash happens, a crash report could be sent to software developers for investigation upon user permission. To facilitate efficient handling of crashes, crash reports received by Microsoft's Windows Error Reporting (WER) system are organized into a set of "buckets". Each bucket contains duplicate crash reports that are deemed as manifestations of the same bug. The bucket information is important for prioritizing efforts to resolve crashing bugs. To improve the accuracy of bucketing, we propose ReBucket, a method for clustering crash reports based on call stack matching. ReBucket measures the similarities of call stacks in crash reports and then assigns the reports to appropriate buckets based on the similarity values. We evaluate ReBucket using crash data collected from five widely-used Microsoft products. The results show that ReBucket achieves better overall performance than the existing methods. In average, the F-measure obtained by ReBucket is about 0.88.