An evolutionary approach for high dimensional attribute selection

  • Authors:
  • Lydia Boudjeloud-Assala

  • Affiliations:
  • Laboratory of Theoretical and Applied Computer Science, University of Lorraine, LITA EA 3097, Ile du Saulcy, Metz Cedex 01, F-57045, France

  • Venue:
  • International Journal of Intelligent Information and Database Systems
  • Year:
  • 2012

Abstract

We present a method to select a relevant dimension subset (with little or no loss of information) for clustering and outlier detection in high dimensional datasets. We use a heuristic search, based on a genetic algorithm, to select the relevant dimension subset. For clustering, the genetic algorithm fitness function uses the validity indexes of classification algorithms. We use these validity indexes first to select a dimension subset and then to evaluate the clustering quality in this subspace. For outlier detection, the genetic algorithm fitness function is an individual distance-based function. The performance of our new dimension selection approach is evaluated on simulations with different high dimensional datasets for both applications (clustering and outlier detection). Furthermore, as the number of selected dimensions is low, it is possible to display the datasets in order to visually evaluate and interpret the obtained results.
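
The search described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract gives no operators or parameters, so the function names (`select_dimensions`, `outlier_fitness`, `subspace_distance`), the fixed subset size, the truncation selection, the crossover and mutation scheme, and all default values are assumptions made for illustration. The outlier fitness used here (largest nearest-neighbour distance in the subspace) is only one possible distance-based score. For clustering, a function with the same signature that returns a cluster validity index could be passed instead.

```python
import random

def subspace_distance(a, b, dims):
    """Euclidean distance between rows a and b, restricted to the chosen dimensions."""
    return sum((a[d] - b[d]) ** 2 for d in dims) ** 0.5

def outlier_fitness(data, dims):
    """Assumed fitness: the largest nearest-neighbour distance in the subspace.
    A high value means some point is well isolated when only `dims` are kept."""
    best = 0.0
    for i, p in enumerate(data):
        nearest = min(subspace_distance(p, q, dims)
                      for j, q in enumerate(data) if j != i)
        best = max(best, nearest)
    return best

def select_dimensions(data, subset_size, fitness, generations=30,
                      pop_size=20, mutation_rate=0.2, seed=0):
    """Genetic search over fixed-size dimension subsets (hypothetical operators)."""
    rng = random.Random(seed)
    n_dims = len(data[0])
    # Each individual is a sorted tuple of dimension indices.
    population = [tuple(sorted(rng.sample(range(n_dims), subset_size)))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        ranked = sorted(population, key=lambda s: fitness(data, s), reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw the child from the union of both parents' dimensions.
            pool = list(set(a) | set(b))
            child = set(rng.sample(pool, subset_size))
            # Mutation: drop one dimension and add one not currently in the child.
            if rng.random() < mutation_rate:
                child.discard(rng.choice(sorted(child)))
                child.add(rng.choice([d for d in range(n_dims) if d not in child]))
            children.append(tuple(sorted(child)))
        population = parents + children
    return max(population, key=lambda s: fitness(data, s))

if __name__ == "__main__":
    # Synthetic data: uniform noise in 10 dimensions plus one point that is
    # extreme only in dimensions 3 and 7.
    rng = random.Random(42)
    data = [[rng.random() for _ in range(10)] for _ in range(30)]
    odd = [rng.random() for _ in range(10)]
    odd[3], odd[7] = 50.0, 50.0
    data.append(odd)
    print(select_dimensions(data, subset_size=2, fitness=outlier_fitness))
```

Keeping the subset size small, as in this sketch, is what makes the final visual inspection practical: a two- or three-dimensional subspace can be shown directly in a scatter plot.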