Analogy-Based Practical Classification Rules for Software Quality Estimation

Authors:
Taghi M. Khoshgoftaar;Naeem Seliya
Affiliations:
Empirical Software Engineering Laboratory, Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431 taghi@cse.fau.edu;Empirical Software Engineering Laboratory, Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431 nseliya@cse.fau.edu
Venue:
Empirical Software Engineering
Year:
2003

Abstract

Software metrics-based quality estimation models can be effective tools for identifying which modules are likely to be fault-prone or not fault-prone. Using such models prior to system deployment can considerably reduce the likelihood of faults being discovered during operations, hence improving system reliability. A software quality classification model is calibrated using metrics from a past release or a similar project, and is then applied to modules currently under development, yielding a timely prediction of which modules are likely to have faults. However, software quality classification models used in practice may not provide a useful balance between the two misclassification rates, especially when there are very few faulty modules in the system being modeled. This paper presents, in the context of case-based reasoning, two practical classification rules that allow appropriate emphasis on each type of misclassification as per the project requirements. The suggested techniques are especially useful for high-assurance systems where faulty modules are rare. The proposed generalized classification methods emphasize the costs of misclassification and the unbalanced distribution of faulty program modules. We illustrate the proposed techniques with a case study that consists of software measurements and fault data collected over multiple releases of a large-scale legacy telecommunication system. In addition to investigating the two classification methods, a brief relative comparison of the techniques is also presented. The results indicate that the level of classification accuracy and model robustness observed for the case study would be beneficial in achieving high software reliability in subsequent system releases. Similar observations are made from our empirical studies with other case studies.
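
The abstract does not spell out the two classification rules, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a case library of past modules with two hypothetical metrics (size and complexity), retrieves the nearest cases by Euclidean distance, and labels a module fault-prone when the share of fault-prone neighbors reaches an adjustable threshold. The function names, the metrics, the data, and the threshold rule itself are all assumptions made for illustration. Lowering the threshold shifts the emphasis toward catching rare faulty modules at the cost of more false alarms.

```python
import math

# Hypothetical case library: ((lines_of_code, complexity), label).
# "fp" = fault-prone, "nfp" = not fault-prone. Faulty modules are rare.
LIBRARY = [
    ((10, 2), "nfp"), ((12, 3), "nfp"), ((15, 2), "nfp"),
    ((20, 4), "nfp"), ((18, 3), "nfp"),
    ((60, 12), "fp"), ((25, 5), "fp"),
]


def nearest_cases(library, query, k):
    """Return the k past modules whose metric vectors are closest to query."""
    ranked = sorted(library, key=lambda case: math.dist(case[0], query))
    return ranked[:k]


def classify(library, query, k=3, threshold=0.5):
    """Label a module "fp" when the fault-prone share of its k nearest
    cases reaches the threshold (an assumed, adjustable decision rule)."""
    neighbors = nearest_cases(library, query, k)
    fp_share = sum(1 for _, label in neighbors if label == "fp") / len(neighbors)
    return "fp" if fp_share >= threshold else "nfp"


def misclassification_rates(library, test_cases, k, threshold):
    """Return (Type I rate, Type II rate).
    Type I: an nfp module labeled fp. Type II: an fp module labeled nfp."""
    type1 = type2 = n_nfp = n_fp = 0
    for metrics, actual in test_cases:
        predicted = classify(library, metrics, k, threshold)
        if actual == "nfp":
            n_nfp += 1
            type1 += predicted == "fp"
        else:
            n_fp += 1
            type2 += predicted == "nfp"
    rate1 = type1 / n_nfp if n_nfp else 0.0
    rate2 = type2 / n_fp if n_fp else 0.0
    return rate1, rate2


if __name__ == "__main__":
    # Hypothetical modules under development with known outcomes.
    test_cases = [((22, 5), "fp"), ((11, 2), "nfp")]
    for threshold in (0.5, 0.3):
        print(threshold, misclassification_rates(LIBRARY, test_cases, 3, threshold))
```

In this toy data, a simple majority threshold of 0.5 misses the faulty test module because only one of its three nearest cases is fault-prone, while a threshold of 0.3 flags it. In practice the threshold would be tuned on a past release so that the two misclassification rates match the project's priorities.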