Evolutionary-based selection of generalized instances for imbalanced classification

  • Authors:
  • Salvador García; Joaquín Derrac; Isaac Triguero; Cristóbal J. Carmona; Francisco Herrera

  • Affiliations:
  • University of Jaén, Department of Computer Science, 23071 Jaén, Spain;University of Granada, Department of Computer Science and Artificial Intelligence, CITIC-UGR (Research Center on Information and Communications Technology), 18071 Granada, Spain;University of Granada, Department of Computer Science and Artificial Intelligence, CITIC-UGR (Research Center on Information and Communications Technology), 18071 Granada, Spain;University of Jaén, Department of Computer Science, 23071 Jaén, Spain;University of Granada, Department of Computer Science and Artificial Intelligence, CITIC-UGR (Research Center on Information and Communications Technology), 18071 Granada, Spain

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2012

Abstract

In supervised classification, many real-world problems involve data that are not distributed evenly among the classes. Such data sets are known as imbalanced data sets. One of the most widely used techniques for dealing with this problem is to preprocess the data prior to the learning process. This paper proposes a method belonging to the nested generalized exemplar family, which accomplishes learning by storing objects in Euclidean n-space. New data are classified by computing their distance to the nearest generalized exemplar. The method is optimized by selecting the most suitable generalized exemplars by means of evolutionary algorithms. An experimental analysis is carried out over a wide range of highly imbalanced data sets, using the statistical tests suggested in the specialized literature. The results show that our evolutionary proposal outperforms other classic and recent models in accuracy and requires storing fewer generalized exemplars.
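
The classification step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each generalized exemplar is an axis-aligned hyperrectangle given by lower and upper bounds, which is the usual representation in nested generalized exemplar learning, and that a point inside a rectangle has distance zero to it. The function names and the toy exemplars are hypothetical, and the evolutionary selection of exemplars is not shown.

```python
import math


def distance_to_rectangle(x, lower, upper):
    """Euclidean distance from point x to the hyperrectangle [lower, upper]."""
    # Per-dimension gap: zero where the coordinate falls within the bounds.
    gaps = [max(lo - xi, xi - hi, 0.0) for xi, lo, hi in zip(x, lower, upper)]
    return math.sqrt(sum(g * g for g in gaps))


def classify(x, exemplars):
    """Return the label of the generalized exemplar nearest to x.

    exemplars is a list of (lower, upper, label) tuples (assumed layout).
    Ties are resolved in favor of the exemplar listed first.
    """
    nearest = min(exemplars, key=lambda e: distance_to_rectangle(x, e[0], e[1]))
    return nearest[2]


# Hypothetical exemplars for a two-class, two-dimensional problem.
exemplars = [
    ([0.0, 0.0], [1.0, 1.0], "majority"),
    ([3.0, 3.0], [4.0, 4.0], "minority"),
]

print(classify([0.5, 0.5], exemplars))  # point inside the first rectangle
print(classify([2.9, 3.5], exemplars))  # point just outside the second one
```

In this sketch a single point is treated as a degenerate rectangle whose lower and upper bounds coincide, so ordinary stored instances and generalized exemplars can share one distance function.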