We studied three methods for improving the identification of small classes, which are also difficult to classify, by balancing an imbalanced class distribution through data reduction. The new method, the neighborhood cleaning (NCL) rule, outperformed simple random sampling within classes and the one-sided selection method in experiments with ten real-world data sets. All reduction methods clearly improved the identification of small classes (by 20-30% in the true-positive rates of the three-nearest-neighbor method and the C4.5 decision tree generator), but the differences between the methods on this measure were insignificant. However, the significant differences in accuracies, true-positive rates, and true-negative rates obtained from the reduced data were in favor of our method. The results suggest that the NCL rule is a useful method for improving the modeling of difficult small classes, as well as for building classifiers that identify these classes in real-world data, which frequently have an imbalanced class distribution.
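The abstract does not spell out the NCL procedure, so the following is only a simplified sketch of the general idea of neighborhood-based data reduction, under stated assumptions: each example is checked against its three nearest neighbors (Euclidean distance), a majority-class example that its neighbors misclassify is removed, and when a small-class example is misclassified, the majority-class neighbors responsible are removed instead. The names `k_nearest` and `neighborhood_clean` and the toy data are hypothetical, and the published rule may include further conditions that are omitted here.

```python
from collections import Counter
import math


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i] (Euclidean), excluding i."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    dists.sort()  # ties in distance are broken by the lower index
    return [j for _, j in dists[:k]]


def neighborhood_clean(points, labels, minority, k=3):
    """Return the indices kept after a simplified neighborhood cleaning pass.

    Assumption (not taken from the abstract): only examples outside the
    `minority` class are ever removed; small-class examples are all kept.
    """
    to_remove = set()
    for i, label in enumerate(labels):
        neighbors = k_nearest(points, i, k)
        votes = Counter(labels[j] for j in neighbors)
        predicted = votes.most_common(1)[0][0]
        if predicted == label:
            continue  # the neighborhood agrees with the label, nothing to clean
        if label != minority:
            # Majority-class example contradicted by its neighbors: drop it.
            to_remove.add(i)
        else:
            # Small-class example misclassified: drop the majority-class
            # neighbors that caused the misclassification.
            to_remove.update(j for j in neighbors if labels[j] != minority)
    return [i for i in range(len(labels)) if i not in to_remove]


# Toy data: five majority points on the left, three small-class points on
# the right, and one majority point (index 8) sitting inside the small class.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0),
          (10, 0), (11, 0), (12, 0), (11, 0.5)]
labels = ["neg"] * 5 + ["pos"] * 3 + ["neg"]

kept = neighborhood_clean(points, labels, minority="pos")
print(kept)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

In this toy case only the stray majority point is removed, so the small class keeps all of its examples while its neighborhood becomes cleaner. Random sampling within classes, by contrast, would discard majority examples without regard to where they lie.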