A preliminary study on the selection of generalized instances for imbalanced classification

Authors:
Salvador García;Joaquín Derrac;Isaac Triguero;Cristóbal Carmona;Francisco Herrera
Affiliations:
University of Jaén, Department of Computer Science, Jaén, Spain;University of Granada, Department of Computer Science and Artificial Intelligence, Granada, Spain;University of Granada, Department of Computer Science and Artificial Intelligence, Granada, Spain;University of Jaén, Department of Computer Science, Jaén, Spain;University of Granada, Department of Computer Science and Artificial Intelligence, Granada, Spain
Venue:
IEA/AIE'10 Proceedings of the 23rd international conference on Industrial engineering and other applications of applied intelligent systems - Volume Part I
Year:
2010

Abstract

Learning in imbalanced domains is one of the recent challenges in machine learning and data mining. In imbalanced classification, data sets contain many examples of one class and few of the other, and the latter class is the one of greater interest from the point of view of learning. One of the most widely used techniques for dealing with this problem is to preprocess the data prior to the learning process. This contribution proposes a method belonging to the nested generalized exemplar family, which accomplishes learning by storing objects in Euclidean n-space. New data are classified by computing their distance to the nearest generalized exemplar. The method is optimized by selecting the most suitable generalized exemplars by means of evolutionary algorithms. The proposal is compared with the most representative nested generalized exemplar learning approaches, and the results show that our evolutionary proposal outperforms them in accuracy and requires storing fewer generalized exemplars.
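The classification step described in the abstract can be illustrated with a minimal sketch. It assumes that a generalized exemplar is an axis-parallel hyperrectangle, as is usual in nested generalized exemplar learning, and that the distance from a point to a hyperrectangle is the Euclidean distance to its closest face (zero if the point lies inside). The names `Hyperrectangle`, `distance`, and `classify` are hypothetical and chosen only for this illustration; the abstract does not specify the exact distance function, feature weighting, or tie-breaking used by the authors, and the evolutionary selection of exemplars is not shown.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Sequence


@dataclass
class Hyperrectangle:
    """Assumed representation of a generalized exemplar: an axis-parallel box.

    A single stored instance is the degenerate case where lower == upper.
    """
    lower: Sequence[float]
    upper: Sequence[float]
    label: str


def distance(point: Sequence[float], rect: Hyperrectangle) -> float:
    """Euclidean distance from a point to the box (0.0 if the point is inside)."""
    total = 0.0
    for x, lo, hi in zip(point, rect.lower, rect.upper):
        if x < lo:
            gap = lo - x      # point lies below the box on this axis
        elif x > hi:
            gap = x - hi      # point lies above the box on this axis
        else:
            gap = 0.0         # point is within the box's range on this axis
        total += gap * gap
    return sqrt(total)


def classify(point: Sequence[float], exemplars: Sequence[Hyperrectangle]) -> str:
    """Return the label of the nearest generalized exemplar."""
    nearest = min(exemplars, key=lambda rect: distance(point, rect))
    return nearest.label


# Illustrative, made-up exemplars: one box for the majority class and one
# single-point exemplar for the minority class.
exemplars = [
    Hyperrectangle(lower=(0.0, 0.0), upper=(2.0, 2.0), label="majority"),
    Hyperrectangle(lower=(3.0, 3.0), upper=(3.0, 3.0), label="minority"),
]
```

With these made-up exemplars, `classify((1.0, 1.0), exemplars)` falls inside the majority box, while `classify((2.8, 2.9), exemplars)` is closer to the minority point. In this sketch, ties between equally distant exemplars are resolved by list order, which is a simplification.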