ReLink: recovering links between bugs and changes

  • Authors:
  • Rongxin Wu;Hongyu Zhang;Sunghun Kim;Shing-Chi Cheung

  • Affiliations:
  • Tsinghua University, Beijing, China;Tsinghua University, Beijing, China;The Hong Kong University of Science and Technology, Hong Kong, Hong Kong;The Hong Kong University of Science and Technology, Hong Kong, Hong Kong

  • Venue:
  • Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering
  • Year:
  • 2011

Quantified Score

Hi-index 0.01

Visualization

Abstract

Software defect information, including links between bugs and committed changes, plays an important role in software maintenance such as measuring quality and predicting defects. Usually, the links are automatically mined from change logs and bug reports using heuristics such as searching for specific keywords and bug IDs in change logs. However, the accuracy of these heuristics depends on the quality of change logs. Bird et al. found that there are many missing links due to the absence of bug references in change logs. They also found that the missing links lead to biased defect information, and it affects defect prediction performance. We manually inspected the explicit links, which have explicit bug IDs in change logs and observed that the links exhibit certain features. Based on our observation, we developed an automatic link recovery algorithm, ReLink, which automatically learns criteria of features from explicit links to recover missing links. We applied ReLink to three open source projects. ReLink reliably identified links with 89% precision and 78% recall on average, while the traditional heuristics alone achieve 91% precision and 64% recall. We also evaluated the impact of recovered links on software maintainability measurement and defect prediction, and found the results of ReLink yields significantly better accuracy than those of traditional heuristics.