Comparing fine-grained source code changes and code churn for bug prediction

Authors:
Emanuel Giger; Martin Pinzger; Harald C. Gall
Affiliations:
University of Zurich, Switzerland; Delft University of Technology, Netherlands; University of Zurich, Switzerland
Venue:
Proceedings of the 8th Working Conference on Mining Software Repositories
Year:
2011


Abstract

A significant amount of research effort has been dedicated to learning prediction models that allow project managers to efficiently allocate resources to those parts of a software system that are most likely bug-prone and therefore critical. Prominent measures for building bug prediction models are product measures, e.g., complexity, and process measures, e.g., code churn. Code churn in terms of lines modified (LM) and past changes have turned out to be significant indicators of bugs. However, these measures are rather imprecise and do not reflect all the detailed changes of particular source code entities during maintenance activities. In this paper, we explore the advantage of using fine-grained source code changes (SCC) for bug prediction. SCC captures the exact code changes and their semantics down to the statement level. We present a series of experiments using different machine learning algorithms with a dataset from the Eclipse platform to empirically evaluate the performance of SCC and LM. The results show that SCC outperforms LM for learning bug prediction models.
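
To illustrate the kind of comparison the abstract describes, the following is a minimal sketch, not the authors' experimental setup. It assumes that each file has an LM count, an SCC count, and a binary bug-prone label, and it scores each measure on its own by the area under the ROC curve (AUC). AUC is chosen here only as a common performance measure; the abstract does not state which measure or which learners were used. The function name `auc` and all numbers are hypothetical and are not data from the paper.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count 0.5).

    It equals the probability that a randomly chosen bug-prone file has a
    higher score than a randomly chosen file that is not bug-prone.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one file of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up per-file values for illustration only (NOT from the paper).
lm = [120, 15, 300, 40, 8, 95]   # lines modified per file
scc = [35, 2, 20, 18, 1, 4]      # fine-grained source code changes per file
bug_prone = [1, 0, 0, 1, 0, 1]   # 1 = file labeled bug-prone

if __name__ == "__main__":
    print(f"AUC using LM:  {auc(lm, bug_prone):.2f}")
    print(f"AUC using SCC: {auc(scc, bug_prone):.2f}")
```

A higher AUC for one measure on such data would mean that it ranks bug-prone files above the others more reliably. A real study would train classifiers and validate them on held-out data rather than rank by a single raw count.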