Analysis of new variable selection methods for discriminant analysis

  • Authors:
  • Joaquín Pacheco;Silvia Casado;Laura Núñez;Olga Gómez

  • Affiliations:
  • Department of Applied Economics, University of Burgos, Spain;Department of Applied Economics, University of Burgos, Spain;Department of Finance, Instituto de Empresa. Business School. Madrid, Spain;Department of Finance, Instituto de Empresa. Business School. Madrid, Spain

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2006

Quantified Score

Hi-index 0.03

Abstract

Several methods for selecting the variables subsequently used in discriminant analysis are proposed and analysed. The aim is to find, from among a set of m variables, a smaller subset that enables an efficient classification of cases. Reducing dimensionality has several advantages, such as lower data acquisition costs, a better understanding of the final classification model, and an increase in the efficiency and efficacy of the model itself. The specific problem consists of finding, for a small integer value of p, the size-p subset of the original variables that yields the greatest percentage of hits in the discriminant analysis. To solve this problem, a series of techniques based on metaheuristic strategies is proposed. After performing some tests, it is found that they obtain significantly better results than the stepwise, backward or forward methods used by classic statistical packages. The way these methods work is illustrated with several examples.
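
The problem stated in the abstract (choose the size-p subset of m variables that maximises the percentage of hits) can be illustrated with a small sketch. It is not the authors' method: the abstract does not say which metaheuristics are used, so a plain swap-based local search stands in for them here, and a nearest-centroid rule stands in for discriminant analysis. The function names (hit_rate, forward_selection, swap_local_search), the synthetic data and the in-sample hit rate are all assumptions made for illustration.

```python
import itertools
import random


def hit_rate(X, y, subset):
    """In-sample fraction of cases assigned to their true class.

    Assumption: a nearest-centroid rule on the chosen columns is used as a
    simple stand-in for discriminant analysis.
    """
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in subset]
    hits = 0
    for x, label in zip(X, y):
        point = [x[j] for j in subset]
        pred = min(
            classes,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        hits += pred == label
    return hits / len(y)


def forward_selection(X, y, p):
    """Greedy baseline: add the variable that most improves the hit rate."""
    m = len(X[0])
    chosen = []
    while len(chosen) < p:
        best = max(
            (j for j in range(m) if j not in chosen),
            key=lambda j: hit_rate(X, y, chosen + [j]),
        )
        chosen.append(best)
    return sorted(chosen)


def swap_local_search(X, y, p, start):
    """Hypothetical stand-in for a metaheuristic: swap one selected variable
    for one unselected variable whenever that strictly raises the hit rate."""
    m = len(X[0])
    current = list(start)
    current_score = hit_rate(X, y, current)
    improved = True
    while improved:
        improved = False
        for i, j in itertools.product(range(p), range(m)):
            if j in current:
                continue
            candidate = current[:i] + [j] + current[i + 1:]
            score = hit_rate(X, y, candidate)
            if score > current_score:
                current, current_score, improved = candidate, score, True
                break  # restart the scan from the improved subset
    return sorted(current), current_score


# Synthetic data (an assumption): 6 variables, only columns 0 and 3 carry signal.
random.seed(0)
X, y = [], []
for i in range(60):
    label = i % 2
    row = [random.gauss(0, 1) for _ in range(6)]
    row[0] += 1.5 * label
    row[3] -= 1.5 * label
    X.append(row)
    y.append(label)

p = 2
fs_subset = forward_selection(X, y, p)
fs_score = hit_rate(X, y, fs_subset)
ls_subset, ls_score = swap_local_search(X, y, p, fs_subset)
print("forward selection:", fs_subset, fs_score)
print("swap local search:", ls_subset, ls_score)
```

Because the local search starts from the forward-selection subset and accepts only strict improvements, its hit rate can never fall below the greedy baseline. A fuller comparison would estimate the hit rate on held-out cases rather than on the training cases.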