This paper introduces a new technique for finding latent software bugs called change classification. Change classification uses a machine learning classifier to determine whether a new software change is more similar to prior buggy changes or to prior clean changes. In this manner, change classification predicts the existence of bugs in software changes. The classifier is trained using features (in the machine learning sense) extracted from the revision history of a software project, as stored in its software configuration management repository. The trained classifier can classify changes as buggy or clean with 78% accuracy and 65% buggy change recall (on average). Change classification has several desirable qualities: (1) the prediction granularity is small (a change to a single file), (2) predictions do not require semantic information about the source code, (3) the technique works for a broad array of project types and programming languages, and (4) predictions can be made immediately upon completion of a change. Contributions of the paper include a description of the change classification approach, techniques for extracting features from source code and change histories, a characterization of the performance of change classification across 12 open source projects, and an evaluation of the predictive power of different groups of features.
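To make the idea concrete, the following is a minimal sketch of change classification under stated assumptions, not the authors' implementation. The abstract does not name the classifier or the exact features, so this sketch swaps in a multinomial naive Bayes model over bag-of-words tokens. The class `ChangeClassifier`, the helper names, and the four toy training changes are all hypothetical. The two metric functions show how accuracy and buggy change recall, the measures reported above, are typically computed.

```python
import math
import re
from collections import Counter


def tokenize(change_text):
    """Split change text (e.g. diff lines or a log message) into lowercase word tokens."""
    return re.findall(r"[a-z_]+", change_text.lower())


class ChangeClassifier:
    """Hypothetical stand-in: multinomial naive Bayes over bag-of-words features."""

    def fit(self, changes, labels):
        self.token_counts = {"buggy": Counter(), "clean": Counter()}
        self.label_counts = Counter(labels)
        for text, label in zip(changes, labels):
            self.token_counts[label].update(tokenize(text))
        self.vocab = set(self.token_counts["buggy"]) | set(self.token_counts["clean"])
        return self

    def predict(self, change_text):
        total = sum(self.label_counts.values())
        scores = {}
        for label, counts in self.token_counts.items():
            if self.label_counts[label] == 0:
                continue  # no training examples for this label
            # Log prior plus Laplace-smoothed log likelihood of each known token.
            score = math.log(self.label_counts[label] / total)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokenize(change_text):
                if tok in self.vocab:
                    score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


def accuracy(predicted, actual):
    """Fraction of changes whose predicted label matches the actual label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def buggy_recall(predicted, actual):
    """Fraction of actually buggy changes that were predicted as buggy."""
    hits = [p == "buggy" for p, a in zip(predicted, actual) if a == "buggy"]
    return sum(hits) / len(hits) if hits else 0.0


# Toy revision history (made-up data, for illustration only).
train_changes = [
    "add unchecked cast in parser loop index",
    "rewrite loop index arithmetic in parser",
    "update readme and license header",
    "rename variable and update comment",
]
train_labels = ["buggy", "buggy", "clean", "clean"]

classifier = ChangeClassifier().fit(train_changes, train_labels)
print(classifier.predict("change loop index in parser"))
```

In a real setting the training labels would come from the project's revision history, and the classifier would be evaluated on held-out changes rather than on a handful of hand-written strings.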