Enhancing software quality estimation using ensemble-classifier based noise filtering
Intelligent Data Analysis
The poor quality of a training dataset can have untoward consequences in software quality estimation problems. The presence of noise in software measurement data can degrade the prediction accuracy of a given learner. A filter improves the quality of a training dataset by removing instances that are likely noisy. We evaluate the Ensemble Filter against the Partitioning Filter and the Classification Filter. The Ensemble and Partitioning Filters combine the predictions of multiple base classifiers, identifying an instance as noisy when a given number of those learners misclassify it. The Partitioning Filter first splits the training dataset into subsets and induces a different base learner on each subset; two implementations of the Partitioning Filter are presented: the Multiple-Partitioning Filter and the Iterative-Partitioning Filter. In contrast, the Ensemble Filter induces its base classifiers on the entire training dataset. The filtering level and/or the number of iterations control the filtering conservativeness: a conservative filter is less likely to remove good data, at the expense of retaining noisy instances. A unique measure for comparing the relative efficiencies of two filters is also presented. Empirical studies on a high-assurance software project evaluate the relative performances of the Ensemble Filter, Multiple-Partitioning Filter, Iterative-Partitioning Filter, and Classification Filter. Our study demonstrates that, with a conservative filtering approach, using several different base learners can improve the efficiency of the filtering schemes.
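To make the voting mechanism concrete, the following is a minimal, illustrative sketch of an ensemble-style noise filter — not the paper's exact algorithm. Two toy base learners (a leave-one-out 1-nearest-neighbour classifier and a class-mean threshold classifier), the 1-D dataset, and the `level` parameter are all invented for illustration. An instance is flagged as noisy when at least `level` of the base learners misclassify it; requiring consensus (`level` equal to the number of learners) gives the conservative behaviour described above, while a lower level filters more aggressively.

```python
# Illustrative ensemble-style noise filter (hypothetical names and data).
# An instance is flagged as noisy when at least `level` base learners
# misclassify it under leave-one-out evaluation.

def knn1_predict(train, x):
    """1-nearest-neighbour prediction on (feature, label) pairs."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def threshold_predict(train, x):
    """Predict class 1 when x lies past the midpoint of the class means."""
    zeros = [f for f, lab in train if lab == 0]
    ones = [f for f, lab in train if lab == 1]
    cut = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return 1 if x > cut else 0

def ensemble_filter(data, learners, level):
    """Indices misclassified (leave-one-out) by >= `level` base learners."""
    noisy = []
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        misses = sum(1 for learn in learners if learn(rest, x) != y)
        if misses >= level:
            noisy.append(i)
    return noisy

# Toy dataset: class 0 clusters near 0, class 1 near 10.
# Index 3 (9.8 labelled 0) simulates a mislabelled, noisy instance.
data = [(0.1, 0), (0.3, 0), (0.2, 0), (9.8, 0),
        (10.1, 1), (9.9, 1), (10.3, 1)]
learners = [knn1_predict, threshold_predict]

print(ensemble_filter(data, learners, level=2))  # consensus: [3]
print(ensemble_filter(data, learners, level=1))  # aggressive: [3, 5]
```

The consensus scheme (`level=2`) removes only the clearly mislabelled instance, while the aggressive scheme also discards a borderline good instance (index 5) that one learner happens to misclassify — the trade-off between removing good data and retaining noise that the abstract describes.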