Software fault prediction methods are well suited to improving software reliability. Research on estimation models, metrics, and methods for measuring and improving processes and products has produced large empirical databases of software projects, and intelligent mining of these datasets can contribute substantially to software reliability. In this paper we present a study on using decision tree classifiers to predict software faults. We propose a new training set filtering method intended to improve classification performance when mining software complexity measures data. The improvement is expected to come from removing identified outliers from the training set. We argue that a classifier trained on a filtered dataset captures a more general knowledge model and should therefore also perform better on unseen cases. The proposed method is applied to a real-world software reliability analysis dataset, and the results are discussed.
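The idea of filtering outliers from a training set before fitting a classifier can be illustrated with a small sketch. The abstract does not say how outliers are identified, so the example below swaps in a simple stand-in: an instance is dropped when the majority label among its k nearest neighbours disagrees with its own label. The function name `knn_disagreement_filter`, the parameter `k`, and the toy metric values are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def knn_disagreement_filter(X, y, k=3):
    """Return indices of training instances kept after filtering.

    Assumption (not from the paper): an instance counts as an outlier
    when the majority label among its k nearest neighbours differs
    from its own label.
    """
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Euclidean distances from instance i to every other instance,
        # sorted ascending and truncated to the k closest.
        neighbours = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )[:k]
        votes = Counter(y[j] for _, j in neighbours)
        majority_label = votes.most_common(1)[0][0]
        if majority_label == yi:
            keep.append(i)
    return keep


# Hypothetical toy data: two complexity metrics per module
# (say, lines of code and cyclomatic complexity) and a fault label.
X = [
    (10, 1), (12, 2), (11, 1), (13, 2),      # small, simple modules
    (90, 20), (95, 22), (92, 21), (94, 23),  # large, complex modules
    (11, 2),                                 # simple module labelled faulty
]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

kept = knn_disagreement_filter(X, y, k=3)
X_filtered = [X[i] for i in kept]
y_filtered = [y[i] for i in kept]
```

In this toy set only the last instance is removed, because its three nearest neighbours are all labelled non-faulty. A decision tree would then be trained on `X_filtered` and `y_filtered` instead of the full set; whether that helps on unseen cases is the question the study examines.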