Automatically identifying commits that induce fixes is an important task, as it enables researchers to quickly and efficiently validate many types of software engineering analyses, such as software metrics or models for predicting faulty components. Previous work on SZZ, an algorithm designed by Sliwerski et al. and improved upon by Kim et al., provides a process for automatically identifying the fix-inducing predecessors of lines that are changed in a bug-fixing commit. However, no one has yet verified that the fix-inducing lines identified by SZZ are in fact responsible for introducing the fixed bug. In addition, the SZZ algorithm relies on annotation graphs to back-track through previous revisions to the fix-inducing change, and these graphs are imprecise in the face of large blocks of modified code. In this work we outline several improvements to the SZZ algorithm. First, we replace annotation graphs with line-number maps that track unique source lines as they change over the lifetime of the software. Second, we use DiffJ, a Java syntax-aware diff tool, to ignore comments and formatting changes in the source. Finally, we begin verifying how often a fix-inducing change identified by SZZ is the true source of a bug.
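To make the idea of back-tracking through line-number maps concrete, the following is a minimal sketch and not the authors' implementation. It assumes each revision of a file is available as a list of source lines, and it uses Python's standard `difflib` as a stand-in for a real diff tool. The names `line_map`, `origin_revision`, and `fix_inducing_revisions` are hypothetical. As a simplification, a modified line is treated as introduced by the revision that modified it, and, unlike DiffJ, this purely textual comparison does not ignore comments or formatting changes.

```python
from difflib import SequenceMatcher


def line_map(old, new):
    """Map 0-based indices of lines in `new` to the same unchanged lines in `old`."""
    mapping = {}
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            mapping[block.b + offset] = block.a + offset
    return mapping


def origin_revision(history, line_index):
    """Return the index of the revision that introduced the given line.

    `history` is a list of revisions, oldest first; `line_index` refers to
    the last revision in `history`.
    """
    rev = len(history) - 1
    while rev > 0:
        mapping = line_map(history[rev - 1], history[rev])
        if line_index not in mapping:
            return rev  # the line was added or modified in this revision
        line_index = mapping[line_index]
        rev -= 1
    return 0


def fix_inducing_revisions(history, fixed):
    """For each pre-fix line that the fix changed or removed, find its origin."""
    pre_fix = history[-1]
    kept = set(line_map(pre_fix, fixed).values())
    touched = [i for i in range(len(pre_fix)) if i not in kept]
    return {i: origin_revision(history, i) for i in touched}


# Hypothetical three-revision history followed by a bug-fixing revision.
r0 = ["int a = 1;", "int b = 2;", "return a + b;"]
r1 = ["int a = 1;", "int b = 2;", "int c = a / 0;", "return a + b;"]
r2 = ["// header"] + r1
fixed = ["// header", "int a = 1;", "int b = 2;", "int c = a / 1;", "return a + b;"]

print(fix_inducing_revisions([r0, r1, r2], fixed))  # {3: 1}
```

In this example the fix changes line 3 of the pre-fix revision. The maps carry that line back through the header insertion in `r2` to `r1`, where it first appears, so revision 1 is reported as the candidate fix-inducing change. Whether such a candidate is the true source of the bug is exactly the question that manual verification has to answer.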