International Journal of Open Source Software and Processes
Background: Bug tracking repositories, such as Bugzilla, are designed to support fault reporting by developers, testers and users of a system. Allowing anyone to find and report faults has an immediate impact on software quality. However, this benefit comes with at least one side effect: users often file reports that describe the same fault. This increases the maintainer's triage time, although the information required to fix the fault is likely to be spread across the different reports.
Aim: The objective of this paper is twofold. First, we want to understand the dynamics of bug report filing for a large, long-running open source project, Firefox. Second, we present a new approach that can reduce the number of duplicate reports.
Method: The novel element of the proposed approach is its ability to concentrate the search for duplicates on specific portions of the bug repository. Our system can be deployed as a search tool that helps reporters query the repository.
Results: When tested as a search tool, our system is able to detect up to 53% of duplicate reports.
Conclusion: The performance of Information Retrieval techniques can be significantly improved by guiding the search for duplicates. This approach results in higher detection rates and constant classification runtime.
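The idea of guiding an Information Retrieval search toward one portion of the repository can be illustrated with a minimal sketch. The abstract does not say how the repository is partitioned or which similarity measure is used, so everything below is an assumption made for illustration: the partition is a hypothetical `component` field, the ranking is plain TF-IDF cosine similarity over report summaries, and the names `find_candidates`, `summary` and `id` are invented here rather than taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a real system would also stem and drop stop words)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_candidates(new_report, repository, top_k=3):
    """Rank possible duplicates of new_report, searching only one portion of the repository.

    ASSUMPTION: the "portion" is the set of reports sharing the new report's
    component. The paper may partition the repository differently.
    """
    subset = [r for r in repository if r["component"] == new_report["component"]]
    if not subset:
        return []

    docs = [tokenize(r["summary"]) for r in subset]
    n = len(docs)
    # Document frequency of each term within the restricted subset only.
    df = Counter(term for doc in docs for term in set(doc))

    def weigh(tokens):
        # Smoothed TF-IDF so that unseen query terms still get a finite weight.
        return {
            t: tf * (math.log((1 + n) / (1 + df.get(t, 0))) + 1.0)
            for t, tf in Counter(tokens).items()
        }

    query = weigh(tokenize(new_report["summary"]))
    scored = [(cosine(query, weigh(doc)), r["id"]) for doc, r in zip(docs, subset)]
    scored.sort(reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Made-up reports, for demonstration only.
    repository = [
        {"id": 1, "component": "Bookmarks", "summary": "browser crash opening bookmarks menu"},
        {"id": 2, "component": "Bookmarks", "summary": "bookmark toolbar icons missing"},
        {"id": 3, "component": "Tabs", "summary": "crash when opening bookmarks menu"},
    ]
    new_report = {"component": "Bookmarks", "summary": "crash when opening bookmarks menu"}
    for score, report_id in find_candidates(new_report, repository):
        print(report_id, round(score, 3))
```

Because only the matching subset is scored, the cost of a query depends on the size of that subset rather than on the size of the whole repository. The trade-off in this sketch is that a duplicate filed under a different component (report 3 above) is never considered.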