Classification of Fault-Prone Software Modules: Prior Probabilities, Costs, and Model Evaluation

Authors:
Taghi M. Khoshgoftaar; Edward B. Allen
Affiliations:
Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431; Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431
Venue:
Empirical Software Engineering
Year:
1998


Abstract

Software quality models can give timely predictions of reliability indicators, for targeting software improvement efforts. In some cases, classification techniques are sufficient for useful software quality models. The software engineering community has not applied informed prior probabilities widely to software quality classification modeling studies. Moreover, even though costs are of paramount concern to software managers, costs of misclassification have received little attention in the software engineering literature. This paper applies informed prior probabilities and costs of misclassification to software quality classification. We also discuss the advantages and limitations of several statistical methods for evaluating the accuracy of software quality classification models. We conducted two full-scale industrial case studies which integrated these concepts with nonparametric discriminant analysis to illustrate how they can be used by a classification technique. The case studies supported our hypothesis that classification models of software quality can benefit by considering informed prior probabilities and by minimizing the expected cost of misclassifications. The case studies also illustrated the advantages and limitations of resubstitution, cross-validation, and data splitting for model evaluation.
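
To make the idea of combining prior probabilities with misclassification costs concrete, the following Python sketch may help. It is an illustration under stated assumptions, not the authors' implementation: it uses a single hypothetical software metric, a simple Gaussian kernel density estimate as a stand-in for nonparametric discriminant analysis, and invented sample values. The names `kernel_density`, `classify`, `prior_fp`, `cost_miss_fp`, and `cost_false_alarm` are all hypothetical.

```python
import math


def kernel_density(x, sample, bandwidth):
    """Gaussian kernel density estimate of a 1-D sample, evaluated at x."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
    return total / norm


def classify(x, not_fp, fp, prior_fp, cost_miss_fp, cost_false_alarm,
             bandwidth=1.0):
    """Pick the label with the lower expected cost of misclassification.

    not_fp, fp       -- metric values of known not-fault-prone / fault-prone modules
    prior_fp         -- assumed prior probability that a module is fault-prone
    cost_miss_fp     -- assumed cost of labelling a fault-prone module as not fault-prone
    cost_false_alarm -- assumed cost of labelling a not-fault-prone module as fault-prone
    """
    f_not_fp = kernel_density(x, not_fp, bandwidth)
    f_fp = kernel_density(x, fp, bandwidth)
    # Labelling x "not fault-prone" is costly only if it is really fault-prone.
    risk_if_not_fp = cost_miss_fp * prior_fp * f_fp
    # Labelling x "fault-prone" is costly only if it is really not fault-prone.
    risk_if_fp = cost_false_alarm * (1.0 - prior_fp) * f_not_fp
    return "fault-prone" if risk_if_not_fp > risk_if_fp else "not fault-prone"


# Invented metric values for illustration only.
not_fault_prone = [1.0, 2.0, 3.0]
fault_prone = [7.0, 8.0, 9.0]

# A borderline module (x = 5.0) with a 10% prior of being fault-prone.
equal_costs = classify(5.0, not_fault_prone, fault_prone, 0.1, 1.0, 1.0)
costly_misses = classify(5.0, not_fault_prone, fault_prone, 0.1, 20.0, 1.0)
print(equal_costs, "|", costly_misses)
```

In this toy setting the borderline module is labelled not fault-prone when both kinds of error cost the same, because the prior favors that class, and fault-prone once missing a fault-prone module is assumed to be twenty times as costly as a false alarm.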