A cooperative coevolutionary algorithm for instance selection for instance-based learning

  • Authors:
  • Nicolás García-Pedrajas;Juan Antonio Romero Del Castillo;Domingo Ortiz-Boyer

  • Affiliations:
  • Department of Computing and Numerical Analysis, University of Córdoba, Córdoba, Spain 14071;Department of Computing and Numerical Analysis, University of Córdoba, Córdoba, Spain 14071;Department of Computing and Numerical Analysis, University of Córdoba, Córdoba, Spain 14071

  • Venue:
  • Machine Learning
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a cooperative evolutionary approach for the problem of instance selection for instance based learning. The model presented takes advantage of one of the recent paradigms in the field of evolutionary computation: cooperative coevolution. This paradigm is based on a similar approach to the philosophy of divide and conquer. In our method, the training set is divided into several subsets that are searched independently. A population of global solutions relates the search in different subsets and keeps track of the best combinations obtained. The proposed model has the advantage over standard methods in that it does not rely on any specific distance metric or classifier algorithm. Additionally, the fitness function of the individuals considers both storage requirements and classification accuracy, and the user can balance both objectives depending on his/her specific needs, assigning different weights to each one of these two terms. The method also shows good scalability when applied to large datasets.The proposed model is favorably compared with some of the most successful standard algorithms, IB3, ICF and DROP3, with a genetic algorithm using CHC method, and with four recent methods of instance selection, MSS, entropy-based instance selection, IMOEA and LVQPRU. The comparison shows a clear advantage of the proposed algorithm in terms of storage requirements, and is, at least, as good as any of the other methods in terms of testing error. A large set of 50 problems from the UCI Machine Learning Repository is used for the comparison. Additionally, a study of the effect of instance label noise is carried out, showing the robustness of the proposed algorithm.The major contribution of our work is showing that cooperative coevolution can be used to tackle large problems taking advantage of its inherently modular nature. We show that a combination of cooperative coevolution together with the principle of divide-and-conquer can be very effective both in terms of improving performance and in reducing computational cost.