Detecting duplicate bug reports reduces triaging effort and saves developers the time of fixing the same issue more than once. Among the automated detection approaches, text-based information retrieval (IR) approaches have been shown to outperform others in terms of both accuracy and time efficiency. However, IR-based approaches perform poorly on duplicate reports that describe the same technical issue in different terms. This paper introduces DBTM, a duplicate bug report detection approach that takes advantage of both IR-based features and topic-based features. DBTM models a bug report as a textual document describing one or more technical issues, and models duplicate bug reports as reports about the same technical issue(s). Trained on historical data that includes already identified duplicate reports, it learns the sets of different terms that describe the same technical issues and uses them to detect duplicates that have not yet been identified. Our empirical evaluation on real-world systems shows that DBTM improves on state-of-the-art approaches by up to 20% in accuracy.
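The idea of combining a term-based signal with a topic-based signal can be illustrated with a minimal sketch. This is not DBTM's actual model: the cosine term similarity, the Jensen-Shannon topic similarity, the linear combination, the weight `alpha`, and all function names are assumptions chosen for illustration, and the topic distributions are assumed to come from a separately trained topic model.

```python
import math
from collections import Counter


def term_similarity(text_a, text_b):
    """Cosine similarity between whitespace-tokenized term counts (stand-in for an IR score)."""
    ca, cb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def topic_similarity(p, q):
    """1 minus the base-2 Jensen-Shannon divergence of two topic distributions."""
    m = [(x + y) / 2 for x, y in zip(p, q)]

    def kl(u, v):
        # Terms with u_i == 0 contribute nothing to the divergence.
        return sum(x * math.log2(x / y) for x, y in zip(u, v) if x > 0)

    return 1.0 - (kl(p, m) + kl(q, m)) / 2


def combined_score(text_a, text_b, topics_a, topics_b, alpha=0.5):
    """Hypothetical linear mix of the two signals; alpha is an assumed tuning weight."""
    return (alpha * term_similarity(text_a, text_b)
            + (1 - alpha) * topic_similarity(topics_a, topics_b))
```

Under these assumptions, two reports that share no terms (for example "app crashes on save" and "program freezes when storing file") can still receive a non-zero score when their topic distributions agree, which is the situation where a purely term-based score falls short.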