New heuristics in feature selection for high dimensional data

  • Authors:
  • Roberto Ruiz

  • Affiliations:
  • School of Engineering, Pablo de Olavide University, Ctra. Utrera km. 1, 41013 Seville, Spain. E-mail: robertoruiz@upo.es

  • Venue:
  • AI Communications
  • Year:
  • 2007

Abstract

Massive data sets have become common in many applications, making the task of finding an optimal subset of attributes extremely difficult. Traditional feature selection techniques can be very inefficient on high-dimensional data, especially when each candidate subset is evaluated by running a learning algorithm. We describe a method based on the statistical significance of adding a feature from a ranked list to the final subset. To score each individual feature, we propose a new, simple, and fast criterion based on the projections of data set elements onto each attribute.
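
The idea of adding features from a ranked list only when the gain is statistically significant can be illustrated with a minimal sketch. The abstract does not say which significance test is used, so the code below assumes a one-sided paired t-test over per-fold scores with 5 folds (critical value 2.132 at the 0.05 level, 4 degrees of freedom). The names `paired_t_statistic`, `incremental_ranked_selection`, and `toy_evaluate`, as well as the toy scores, are hypothetical and serve only as an illustration, not as the author's implementation.

```python
import math
import statistics
from typing import Callable, List, Sequence


def paired_t_statistic(new_scores: Sequence[float],
                       old_scores: Sequence[float]) -> float:
    """Paired t statistic of the per-fold differences (new - old)."""
    diffs = [n - o for n, o in zip(new_scores, old_scores)]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        # Identical difference on every fold: significant only if positive.
        return math.inf if mean > 0 else 0.0
    return mean / (sd / math.sqrt(len(diffs)))


def incremental_ranked_selection(
    ranked: List[str],
    evaluate: Callable[[List[str]], Sequence[float]],
    t_crit: float = 2.132,  # assumption: one-sided 0.05 level, 5 folds
) -> List[str]:
    """Walk the ranked list once; keep a feature only if adding it
    improves the per-fold scores significantly."""
    selected: List[str] = []
    best_scores = None
    for feature in ranked:
        candidate = selected + [feature]
        scores = evaluate(candidate)
        # The top-ranked feature is always accepted as the starting point.
        if best_scores is None or paired_t_statistic(scores, best_scores) > t_crit:
            selected = candidate
            best_scores = scores
    return selected


# Hypothetical per-fold accuracies standing in for a real learning algorithm.
TOY_SCORES = {
    ("a",): [0.60, 0.62, 0.61, 0.63, 0.59],
    ("a", "b"): [0.70, 0.73, 0.70, 0.74, 0.68],
    ("a", "b", "c"): [0.71, 0.72, 0.70, 0.75, 0.67],
}


def toy_evaluate(subset: List[str]) -> Sequence[float]:
    return TOY_SCORES[tuple(subset)]


result = incremental_ranked_selection(["a", "b", "c"], toy_evaluate)
print(result)
```

In this toy run, feature `b` raises every fold by about 0.10 and is kept, while `c` changes the folds in both directions with no average gain and is rejected. Because the ranked list is traversed only once, the learning algorithm is run once per feature rather than once per candidate subset.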