Misclassification cost-sensitive fault prediction models

Authors:
Yue Jiang;Bojan Cukic
Affiliations:
West Virginia University, Morgantown, WV;West Virginia University, Morgantown, WV
Venue:
PROMISE '09 Proceedings of the 5th International Conference on Predictor Models in Software Engineering
Year:
2009

Abstract

Traditionally, software fault prediction models are built by assuming a uniform misclassification cost. In other words, the cost of misclassifying a faulty module as fault-free is assumed to equal the cost of misclassifying a fault-free module as faulty. In reality, these two costs are rarely equal. They are project-specific and reflect the characteristics of the domain in which the program operates. In this paper, using project information from a public repository, we analyze the benefits of techniques that incorporate misclassification costs into the development of software fault prediction models. We find that cost-sensitive learning does not provide operational points that outperform those of cost-insensitive classifiers. However, an advantage of cost-sensitive modeling is that it makes explicit the choice of the operational threshold appropriate for the cost differential.
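
The idea of choosing an operational threshold to match the cost differential can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the toy scores and labels, and the cost values are all hypothetical, chosen only to show how the lowest-cost threshold moves when the two misclassification costs differ. The closed-form threshold at the end is the standard decision-theoretic result and holds only under the assumption that the scores are calibrated probabilities of faultiness.

```python
from typing import Sequence, Tuple


def total_cost(scores: Sequence[float], labels: Sequence[int],
               threshold: float, cost_fn: float, cost_fp: float) -> float:
    """Misclassification cost when modules scoring >= threshold are flagged faulty.

    labels: 1 = faulty module, 0 = fault-free module.
    cost_fn: cost of calling a faulty module fault-free (false negative).
    cost_fp: cost of calling a fault-free module faulty (false positive).
    """
    cost = 0.0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if label == 1 and not flagged:
            cost += cost_fn
        elif label == 0 and flagged:
            cost += cost_fp
    return cost


def best_threshold(scores: Sequence[float], labels: Sequence[int],
                   cost_fn: float, cost_fp: float) -> Tuple[float, float]:
    """Return (threshold, cost) with the lowest total cost on the given data.

    Candidates are the observed scores plus +inf (flag nothing).
    Ties are resolved in favour of the lowest threshold.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    costs = [(t, total_cost(scores, labels, t, cost_fn, cost_fp))
             for t in candidates]
    return min(costs, key=lambda pair: pair[1])


def calibrated_threshold(cost_fn: float, cost_fp: float) -> float:
    """Cost-minimizing threshold, assuming scores are calibrated probabilities."""
    return cost_fp / (cost_fp + cost_fn)


# Hypothetical classifier scores and true labels for six modules.
scores = [0.1, 0.3, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]

# Missing a fault is five times as costly as a false alarm.
print(best_threshold(scores, labels, cost_fn=5, cost_fp=1))
# A false alarm is five times as costly as missing a fault.
print(best_threshold(scores, labels, cost_fn=1, cost_fp=5))
```

On this toy data, the threshold that minimizes cost is lower when missed faults are expensive and higher when false alarms are expensive. The classifier's scores are identical in both cases; only the operating point changes.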