Detecting outliers using rule-based modeling for improving CBR-based software quality classification models

Authors:
Taghi M. Khoshgoftaar;Lofton A. Bullard;Kehan Gao
Affiliations:
Florida Atlantic University, Boca Raton, Florida;Florida Atlantic University, Boca Raton, Florida;Florida Atlantic University, Boca Raton, Florida
Venue:
ICCBR'03 Proceedings of the 5th international conference on Case-based reasoning: Research and Development
Year:
2003

Abstract

Deploying a high-quality software product is a major concern for the project management team. Significant research has been dedicated to developing methods that improve metrics-based software quality classification models. Several studies have shown that the accuracy of such models improves when outliers and data noise are removed from the training data set. This study presents a new approach, Rule-Based Modeling (RBM), for detecting and removing outliers in the training data in order to improve the accuracy of a Case-Based Reasoning (CBR) classification model. We chose to study CBR models because of their sensitivity to outliers in the training data set. Furthermore, we wanted to affirm the RBM technique as a viable outlier detector. We evaluate our approach by comparing the classification accuracy of CBR models built with and without removing outliers from the training data set. The results demonstrate that applying the RBM technique to eliminate outliers significantly improves the accuracy of CBR-based software quality classification models.
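
The evaluation described in the abstract (build a CBR classifier with and without outlier removal, then compare accuracy) can be illustrated with a small sketch. The abstract does not give the actual RBM rules or the CBR similarity function, so everything below is an assumption made for illustration: the rule is a hypothetical per-metric threshold test, an outlier is taken to be a training case whose label disagrees with that rule, CBR is approximated by a nearest-neighbour vote over Euclidean distance, and the metric values are made-up toy data rather than results from the paper.

```python
from collections import Counter
from math import dist

def rule_predict(case, thresholds):
    """Hypothetical rule: fault-prone (1) if any metric exceeds its threshold."""
    return int(any(value > limit for value, limit in zip(case, thresholds)))

def remove_outliers(cases, labels, thresholds):
    """Keep only the training cases whose label agrees with the rule."""
    kept = [(c, y) for c, y in zip(cases, labels)
            if rule_predict(c, thresholds) == y]
    return [c for c, _ in kept], [y for _, y in kept]

def cbr_classify(query, cases, labels, k=1):
    """Nearest-neighbour stand-in for CBR: majority vote of the k closest cases."""
    nearest = sorted(range(len(cases)), key=lambda i: dist(query, cases[i]))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train_x, train_y, test_x, test_y, k=1):
    """Fraction of test cases the CBR stand-in classifies correctly."""
    hits = sum(cbr_classify(x, train_x, train_y, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_x)

# Toy data (assumed): each case is (size metric, complexity metric);
# label 1 = fault-prone, 0 = not fault-prone.
# The last two training cases are deliberately mislabeled to act as noise.
train_x = [(10, 1), (12, 2), (11, 1), (50, 9), (55, 10), (52, 8), (11, 2), (53, 9)]
train_y = [0, 0, 0, 1, 1, 1, 1, 0]
test_x = [(11, 2), (53, 9), (10, 2), (54, 10)]
test_y = [0, 1, 0, 1]
thresholds = (30, 5)  # assumed rule thresholds, one per metric

clean_x, clean_y = remove_outliers(train_x, train_y, thresholds)
acc_raw = accuracy(train_x, train_y, test_x, test_y)
acc_clean = accuracy(clean_x, clean_y, test_x, test_y)
print(f"accuracy without outlier removal: {acc_raw:.2f}")
print(f"accuracy with outlier removal:    {acc_clean:.2f}")
```

In this toy setting the two mislabeled cases sit exactly where two test queries fall, which is why a nearest-neighbour classifier is hurt by them; that mirrors the sensitivity of CBR to training outliers that the abstract gives as its motivation, but it says nothing about the size of the effect reported in the paper.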