Software defect prediction using Bayesian networks

Authors:
Ahmet Okutan;Olcay Taner Yıldız
Affiliations:
Department of Computer Engineering, Işık University, Istanbul, Turkey and Meşrutiyet Koyu Universite Sokak, Istanbul, Turkey;Department of Computer Engineering, Işık University, Istanbul, Turkey and Meşrutiyet Koyu Universite Sokak, Istanbul, Turkey
Venue:
Empirical Software Engineering
Year:
2014

Abstract

Many software metrics have been proposed and used for defect prediction in the literature. Rather than working with all of them, it is more practical to identify the most important metrics and focus on those when predicting defectiveness. We use Bayesian networks to determine the probabilistic influence relationships among software metrics and defect proneness. In addition to the metrics used in the Promise data repository, we define two more: NOD, the number of developers, and LOCQ, the lack of coding quality. We extract these metrics by inspecting the source code repositories of the selected Promise data sets. From the resulting models, we learn the marginal defect proneness probability of the whole software system, the set of most effective metrics, and the influence relationships among metrics and defectiveness. Our experiments on nine open source Promise data sets show that response for class (RFC), lines of code (LOC), and lack of coding quality (LOCQ) are the most effective metrics, whereas coupling between objects (CBO), weighted methods per class (WMC), and lack of cohesion of methods (LCOM) are less effective in predicting defect proneness. Furthermore, number of children (NOC) and depth of inheritance tree (DIT) have very limited effect and are untrustworthy. In addition, based on the experiments on the Poi, Tomcat, and Xalan data sets, we observe a positive correlation between the number of developers (NOD) and the level of defectiveness. However, further investigation involving a greater number of projects is needed to confirm our findings.
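
To make the notion of a marginal defect proneness probability concrete, here is a minimal sketch in Python. It is not the authors' model: the two-parent network structure, the binary "high/low" discretization, and every probability value below are hypothetical assumptions chosen for illustration only. It shows how, once a Bayesian network's conditional probability table is known, the marginal probability of defectiveness follows by summing over the states of the parent metrics.

```python
from itertools import product

# Hypothetical priors: probability that a class has a "high" value of each metric.
# These numbers are illustrative assumptions, not results from the paper.
p_high = {"RFC": 0.3, "LOC": 0.4}

# Hypothetical conditional probability table:
# P(defective | RFC is high?, LOC is high?), keyed in the same order as p_high.
p_defect_given = {
    (False, False): 0.05,
    (False, True): 0.20,
    (True, False): 0.30,
    (True, True): 0.60,
}


def marginal_defect_probability(priors, cpt):
    """Sum P(parent states) * P(defective | parent states) over all parent states.

    Assumes the parent metrics are independent of each other, which is a
    simplification made for this sketch.
    """
    names = list(priors)
    total = 0.0
    for states in product([False, True], repeat=len(names)):
        weight = 1.0
        for name, is_high in zip(names, states):
            weight *= priors[name] if is_high else 1.0 - priors[name]
        total += weight * cpt[states]
    return total


p_defect = marginal_defect_probability(p_high, p_defect_given)
print(f"P(defective) = {p_defect:.3f}")
```

In a network learned from data, the structure and the table entries would be estimated from the metric values and defect labels rather than fixed by hand as they are here.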