A new discrete particle swarm algorithm applied to attribute selection in a bioinformatics data set

  • Authors: Elon S. Correa; Alex A. Freitas; Colin G. Johnson

  • Affiliations: University of Kent, Canterbury, UK (all authors)

  • Venue: Proceedings of the 8th annual conference on Genetic and evolutionary computation
  • Year: 2006

Abstract

Many data mining applications involve the task of building a model for predictive classification. The goal of such a model is to classify examples (records or data instances) into classes or categories of the same type. The use of variables (attributes) not related to the classes can reduce the accuracy and reliability of a classification or prediction model. Superfluous variables can also increase the cost of building a model, particularly on large data sets. We propose a discrete Particle Swarm Optimization (PSO) algorithm designed for attribute selection. The proposed algorithm deals with discrete variables, and its population of candidate solutions contains particles of different sizes. The performance of this algorithm is compared with that of a standard binary PSO algorithm on the task of selecting attributes in a bioinformatics data set. Two criteria are used for the comparison: (1) maximizing predictive accuracy; and (2) finding the smallest subset of attributes.
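To make the baseline concrete, the following is a minimal sketch of a standard binary PSO used for attribute selection. It is not the discrete PSO proposed in the paper, whose details the abstract does not give. Each particle is a fixed-length bit mask over the attributes, and each velocity component is passed through a sigmoid to obtain the probability of setting that bit to 1. Several things here are assumptions made for illustration: the function names `binary_pso` and `toy_fitness`, all parameter values, the inertia weight `w`, the lexicographic ordering of the two criteria (accuracy first, then subset size), and the toy fitness itself, which stands in for a real classifier evaluated on the data set.

```python
import math
import random

# Hypothetical stand-in for classifier accuracy: attributes 0 and 3 are
# treated as the only relevant ones. A real run would train and evaluate
# a classifier on the selected attributes instead.
RELEVANT = {0, 3}

def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i] == 1)
    accuracy_proxy = hits / len(RELEVANT)
    # Tuples compare lexicographically: higher accuracy first,
    # then fewer selected attributes (assumed ordering of the criteria).
    return (accuracy_proxy, -sum(mask))

def binary_pso(fitness, n_attrs, n_particles=10, n_iters=30,
               w=0.8, c1=2.0, c2=2.0, v_max=4.0, seed=0):
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_attrs)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_attrs for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_attrs):
                # Velocity update pulls toward personal and global bests.
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp to avoid saturation
                vel[i][d] = v
                # Sigmoid of the velocity is the probability of bit = 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

best, fit = binary_pso(toy_fitness, n_attrs=6)
print("selected attributes:", [i for i, bit in enumerate(best) if bit])
print("fitness (accuracy proxy, -subset size):", fit)
```

Note that every particle in this baseline has the same length (one bit per attribute), which contrasts with the abstract's description of the proposed algorithm as using particles of different sizes.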