Software Cost Estimation with Incomplete Data
IEEE Transactions on Software Engineering
IEEE Transactions on Software Engineering - Special section on the seventh international software metrics symposium
Modern Information Retrieval
Two case studies of open source software development: Apache and Mozilla
ACM Transactions on Software Engineering and Methodology (TOSEM)
Recovering Traceability Links between Code and Documentation
IEEE Transactions on Software Engineering
Identifying Reasons for Software Changes Using Historic Databases
ICSM '00 Proceedings of the International Conference on Software Maintenance (ICSM'00)
Populating a Release History Database from Version Control and Bug Tracking Systems
ICSM '03 Proceedings of the International Conference on Software Maintenance
Analyzing and Relating Bug Report Data for Feature Tracking
WCRE '03 Proceedings of the 10th Working Conference on Reverse Engineering
Empirical Software Engineering
MSR '05 Proceedings of the 2005 international workshop on Mining software repositories
Automatic Identification of Bug-Introducing Changes
ASE '06 Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
Predicting Faults from Cached History
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Detection of Duplicate Defect Reports Using Natural Language Processing
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Predicting Defects for Eclipse
PROMISE '07 Proceedings of the Third International Workshop on Predictor Models in Software Engineering
An approach to detecting duplicate bug reports using natural language and execution information
Proceedings of the 30th international conference on Software engineering
What do large commits tell us?: a taxonomical study of large commits
Proceedings of the 2008 international working conference on Mining software repositories
Data sets and data quality in software engineering
Proceedings of the 4th international workshop on Predictor models in software engineering
The secret life of bugs: Going past the errors and omissions in software repositories
ICSE '09 Proceedings of the 31st International Conference on Software Engineering
Fair and balanced?: bias in bug-fix datasets
Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Proceedings of the joint international and annual ERCIM workshops on Principles of software evolution (IWPSE) and software evolution (Evol) workshops
Benchmarking Lightweight Techniques to Link E-Mails and Source Code
WCRE '09 Proceedings of the 2009 16th Working Conference on Reverse Engineering
Linking e-mails and source code artifacts
Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1
A machine learning approach for text categorization of fixing-issue commits on CVS
Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement
The missing links: bugs and bug-fix commits
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
LINKSTER: enabling efficient manual inspection and annotation of mined data
Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
ICSM '10 Proceedings of the 2010 IEEE International Conference on Software Maintenance
A Case Study of Bias in Bug-Fix Datasets
WCRE '10 Proceedings of the 2010 17th Working Conference on Reverse Engineering
Dealing with noise in defect prediction
Proceedings of the 33rd International Conference on Software Engineering
Bug prediction based on fine-grained module histories
Proceedings of the 34th International Conference on Software Engineering
Identifying Linux bug fixing patches
Proceedings of the 34th International Conference on Software Engineering
Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering
Proceedings of the ACM-IEEE international symposium on Empirical software engineering and measurement
Multi-layered approach for recovering links between bug reports and fixes
Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering
A hybrid bug triage algorithm for developer recommendation
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Proceedings of the 2013 International Conference on Software Engineering
It's not a bug, it's a feature: how misclassification impacts bug prediction
Proceedings of the 2013 International Conference on Software Engineering
Assisting code search with automatic query reformulation for bug localization
Proceedings of the 10th Working Conference on Mining Software Repositories
Sample size vs. bias in defect prediction
Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering
A cost-effectiveness criterion for applying software defect prediction models
Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering
Software defect information, including links between bugs and committed changes, plays an important role in software maintenance tasks such as measuring quality and predicting defects. These links are usually mined automatically from change logs and bug reports using heuristics such as searching for specific keywords and bug IDs in change logs. However, the accuracy of these heuristics depends on the quality of the change logs. Bird et al. found that many links are missing because change logs lack bug references. They also found that the missing links lead to biased defect information, which in turn affects defect prediction performance. We manually inspected explicit links, that is, links with explicit bug IDs in change logs, and observed that they exhibit certain features. Based on this observation, we developed an automatic link recovery algorithm, ReLink, which learns feature criteria from explicit links in order to recover missing links. We applied ReLink to three open source projects. ReLink reliably identified links with 89% precision and 78% recall on average, while the traditional heuristics alone achieved 91% precision and 64% recall. We also evaluated the impact of recovered links on software maintainability measurement and defect prediction, and found that the results of ReLink yield significantly better accuracy than those of traditional heuristics.
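To make the traditional heuristic concrete, the following is a minimal Python sketch of keyword and bug-ID matching over change logs. It is an illustration only, not the ReLink algorithm and not the exact heuristic evaluated in the paper: the regular expression `BUG_REF`, the function name `explicit_links`, and the sample commit IDs, messages, and bug IDs are all hypothetical.

```python
import re

# Hypothetical pattern (an assumption, not taken from the paper): a
# fix-related keyword followed by a number, or a bare "#<number>".
BUG_REF = re.compile(
    r"(?:\b(?:bug|issue|fix(?:es|ed)?|defect)\b\s*[:#]?\s*|#)(\d+)",
    re.IGNORECASE,
)


def explicit_links(change_logs, known_bug_ids):
    """Return (commit_id, bug_id) pairs whose log text names a known bug ID."""
    links = []
    for commit_id, message in change_logs.items():
        for match in BUG_REF.finditer(message):
            bug_id = int(match.group(1))
            # Keep only numbers that exist in the bug tracker, so that
            # unrelated numbers in a message are not mistaken for bug IDs.
            if bug_id in known_bug_ids:
                links.append((commit_id, bug_id))
    return links


# Made-up sample data for illustration.
logs = {
    "r101": "Fixed bug 4021: null pointer in parser",
    "r102": "Refactor tokenizer, no functional change",
    "r103": "Handle empty input (see #4050)",
    "r104": "Fix crash when config file is missing",
}
print(explicit_links(logs, {4021, 4050, 4077}))
```

In this made-up data, commit `r104` describes a fix but names no bug ID, so the heuristic cannot link it to any report. This is the kind of missing link that limits the recall of keyword-based mining.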