Although many more complex learning algorithms exist, k-nearest neighbor (k-NN) is still one of the most successful classifiers in real-world applications. One way of scaling up the k-NN classifier to deal with huge datasets is instance selection. Because the amount of data grows constantly in almost every pattern recognition task, we need more efficient instance selection algorithms that achieve larger reductions while maintaining the accuracy obtained with the selected subset. However, most instance selection methods do not work well on class-imbalanced problems, because they tend to remove too many instances from the minority class. In this paper we present a way to improve instance selection for class-imbalanced problems by allowing the algorithms to select instances more than once. In this way, the few instances of the minority class can cover larger portions of the space, and the testing error of the standard approach can be reached faster and with fewer instances. No other constraint is imposed on the instance selection method. An extensive comparison on 40 datasets from the UCI Machine Learning Repository shows the usefulness of our approach compared with the established method of evolutionary instance selection. In the worst case, our method matches the error obtained by standard instance selection while achieving a larger reduction and a shorter execution time.
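The abstract does not spell out how a repeated selection enters the classifier, so the following is only an illustrative sketch under one plausible reading: each selected instance carries a multiplicity, and an instance selected c times contributes c identical neighbors to the k-NN vote. The function name `knn_predict_with_multiplicity`, the `counts` argument, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import dist


def knn_predict_with_multiplicity(query, instances, labels, counts, k=3):
    """Classify `query` by k-NN over a selected subset in which instance i
    appears counts[i] times (counts[i] == 0 means it was not selected).

    Assumption: a repeated selection is modeled as duplicate neighbors.
    """
    # Expand every selected instance into counts[i] identical copies.
    expanded = [
        (dist(query, x), y)
        for x, y, c in zip(instances, labels, counts)
        for _ in range(c)
    ]
    if not expanded:
        raise ValueError("no instance was selected")
    # Sort by distance only; the sort is stable, so ties keep input order.
    expanded.sort(key=lambda pair: pair[0])
    # Majority vote among the k nearest copies.
    votes = Counter(label for _, label in expanded[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): one minority instance, two majority instances.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.2)]
y = ["min", "maj", "maj"]
query = (0.4, 0.4)  # closest to the minority instance

# Each instance selected once: the two majority neighbors outvote it.
print(knn_predict_with_multiplicity(query, X, y, [1, 1, 1], k=3))  # maj
# Minority instance selected twice: it now holds two of the three votes.
print(knn_predict_with_multiplicity(query, X, y, [2, 1, 1], k=3))  # min
```

Under this reading, repetition only matters for k > 1: with k = 1 a duplicate never changes the nearest neighbor, whereas with larger k it lets a single minority instance dominate the vote in its own neighborhood without adding any new stored point.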