This paper describes a study performed in an industrial setting that attempts to build predictive models to identify parts of a Java system with a high fault probability. The system under consideration is constantly evolving, as several releases a year are shipped to customers. Developers usually have limited resources for testing and would like to devote extra resources to faulty system parts. The main research focus of this paper is to systematically assess three aspects of how to build and evaluate fault-proneness models in the context of this large Java legacy system development project: (1) compare many data mining and machine learning techniques for building fault-proneness models, (2) assess the impact of using different metric sets, such as source code structural measures and change/fault history (process measures), and (3) compare several alternative ways of assessing the performance of the models, in terms of (i) confusion matrix criteria such as accuracy and precision/recall, (ii) ranking ability, using the area under the receiver operating characteristic curve (ROC area), and (iii) our proposed cost-effectiveness measure (CE). The results of the study indicate that the choice of fault-proneness modeling technique has limited impact on the resulting classification accuracy or cost-effectiveness. There are, however, large differences between the individual metric sets in terms of cost-effectiveness. Although the process measures are among the most expensive ones to collect, including them as candidate measures significantly improves the prediction models, in terms of both ROC area and CE, compared with models that include only structural measures and/or their deltas between releases. Further, we observe that what is considered the best model is highly dependent on the criteria that are used to evaluate and compare the models. The regular confusion matrix criteria, although popular, are not clearly related to the problem at hand, namely the cost-effectiveness of using fault-proneness prediction models to focus verification efforts so as to deliver software with fewer faults at lower cost.
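The first two families of evaluation criteria named above can be illustrated with a minimal Python sketch. It is not the study's own tooling: the function names, the 0.5 classification threshold, and the sample labels and scores are all hypothetical. The ROC area is computed with the standard rank-based formulation (the probability that a randomly chosen faulty module receives a higher score than a randomly chosen non-faulty one). The proposed CE measure is not defined in this passage, so it is not sketched here.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int],
                      scores: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision and recall at a fixed threshold.

    y_true holds 1 for a faulty module and 0 otherwise; scores are
    predicted fault probabilities. The 0.5 default is an assumption.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def roc_area(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Each (faulty, non-faulty) pair counts 1 if the faulty module is
    ranked higher, 0.5 on a tie, and 0 otherwise.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one faulty and one non-faulty module")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: six modules, three of them faulty.
labels = [1, 0, 1, 0, 0, 1]
predicted_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
metrics = confusion_metrics(labels, predicted_scores)
area = roc_area(labels, predicted_scores)
```

The confusion matrix values change with the chosen threshold, whereas the ROC area depends only on how the modules are ranked. This is one reason the two kinds of criteria can favor different models.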