Detecting bug duplicate reports through local references

  • Authors:
  • Tomi Prifti; Sean Banerjee; Bojan Cukic

  • Affiliations:
  • West Virginia University, Morgantown, WV; West Virginia University, Morgantown, WV; West Virginia University, Morgantown, WV

  • Venue:
  • Proceedings of the 7th International Conference on Predictive Models in Software Engineering
  • Year:
  • 2011

Abstract

Background: Bug tracking repositories, such as Bugzilla, are designed to support fault reporting for developers, testers and users of the system. Allowing anyone to contribute to finding and reporting faults has an immediate impact on software quality. However, this benefit comes with at least one side effect: users often file reports that describe the same fault. This increases the maintainers' triage time, although important information required to fix the fault is likely to be spread across the different reports. Aim: The objective of this paper is twofold. First, we want to understand the dynamics of bug report filing for a large, long-duration open source project, Firefox. Second, we present a new approach that can reduce the number of duplicate reports. Method: The novel element in the proposed approach is the ability to concentrate the search for duplicates on specific portions of the bug repository. Our system can be deployed as a search tool to help reporters query the repository. Results: When tested as a search tool, our system detects up to 53% of duplicate reports. Conclusion: The performance of Information Retrieval techniques can be significantly improved by guiding the search for duplicates. This approach results in higher detection rates and constant classification runtime.