Improved Duplicate Bug Report Identification

Authors:
Yuan Tian;Chengnian Sun;David Lo
Venue:
CSMR '12 Proceedings of the 2012 16th European Conference on Software Maintenance and Reengineering
Year:
2012

Citing 0
Cited 3

Search-based duplicate defect detection: an industrial experience
Proceedings of the 10th Working Conference on Mining Software Repositories

An analysis of requirements evolution in open source projects: recommendations for issue trackers
Proceedings of the 2013 International Workshop on Principles of Software Evolution

Leveraging machine learning and information retrieval techniques in software evolution tasks: summary of the first MALIR-SE workshop, at ASE 2013
ACM SIGSOFT Software Engineering Notes

Abstract

Bugs are prevalent in software systems. To improve the reliability of software systems, developers often allow end users to provide feedback on bugs that they encounter. Users can do this by submitting a bug report to a bug report management system such as Bugzilla. This process, however, is uncoordinated and distributed, which means that many users may submit bug reports describing the same problem. These are referred to as duplicate bug reports. The existence of many duplicate bug reports can cause considerable unnecessary manual effort, as a triager often needs to manually tag bug reports as duplicates. Recently, a number of studies have investigated the duplicate bug report problem, which in effect answer the following question: given a new bug report, retrieve k other similar bug reports. This, however, still requires substantial manual effort, which could be reduced further. Jalbert and Weimer were the first to introduce the direct detection of duplicate bug reports, which answers the question: given a new bug report, classify it as a duplicate bug report or not. In this paper, we extend Jalbert and Weimer's work by improving the accuracy of automated duplicate bug report identification. We experiment with bug reports from the Mozilla bug tracking system that were reported between February 2005 and October 2005, and find that we can improve the accuracy of the previous approach by about 160%.
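
The two problem formulations contrasted in the abstract can be illustrated with a minimal sketch. This is not the method of the paper or of Jalbert and Weimer: the Jaccard token-overlap measure, the 0.5 threshold, the function names, and the sample reports are all assumptions chosen only to show the difference between retrieving k similar reports and classifying a report as a duplicate or not.

```python
import re


def tokens(text):
    """Lowercase word tokens of a bug report summary, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_similar(new_report, existing, k):
    """Retrieval formulation: rank existing reports by similarity, return the top k.

    `existing` maps a report id to its text. Ties are broken by report id.
    """
    new_tokens = tokens(new_report)
    scored = [(jaccard(new_tokens, tokens(text)), report_id)
              for report_id, text in existing.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(report_id, score) for score, report_id in scored[:k]]


def is_duplicate(new_report, existing, threshold=0.5):
    """Classification formulation: duplicate if the best match reaches the threshold.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    best = top_k_similar(new_report, existing, 1)
    return bool(best) and best[0][1] >= threshold


# Hypothetical sample data, invented for illustration only.
existing_reports = {
    101: "Browser crashes when opening a PDF attachment",
    102: "Bookmarks toolbar disappears after restart",
    103: "Crash on opening PDF attachment in mail",
}
new_report = "Crash when opening PDF attachment"

print(top_k_similar(new_report, existing_reports, 2))
print(is_duplicate(new_report, existing_reports))
```

In the retrieval formulation a triager still has to inspect the returned list, whereas the classification formulation yields a yes/no decision directly, which is the source of the reduced manual effort the abstract refers to.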