Information-theoretic feature selection for functional data classification

  • Authors:
  • Vanessa Gómez-Verdejo; Michel Verleysen; Jérôme Fleury

  • Affiliations:
  • Department of Signal Theory and Communications, Universidad Carlos III de Madrid, Avda. Universidad 30, 28911 Leganés, Madrid, Spain; DICE - Machine Learning Group, Université Catholique de Louvain, 3 Place du Levant, B-1348 Louvain-la-Neuve, Belgium; Manufacture Française des Pneumatiques Michelin, Bât. F32 - Site de Ladoux, 23 Place des Carmes-Deschaux, 63040 Clermont-Ferrand, France

  • Venue:
  • Neurocomputing
  • Year:
  • 2009

Abstract

The classification of functional or high-dimensional data requires selecting a reduced subset of features from the initial set, both to fight the curse of dimensionality and to ease interpretation of the problem and the model. The mutual information criterion may be used in that context, but it suffers from the difficulty of estimating it from a finite set of samples. Efficient estimators are not designed specifically for classification contexts, and thus suffer from further drawbacks and difficulties. This paper presents an estimator of mutual information that is specifically designed for classification tasks, including multi-class ones. It is combined with a recently published stopping criterion in a traditional forward feature-selection procedure. Experiments on both traditional benchmarks and on an industrial functional classification problem show the added value of this estimator.
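The paper's contribution is a mutual information estimator tailored to classification; the forward selection procedure it plugs into is standard. As a generic illustration of that procedure (not the authors' estimator), here is a minimal sketch using a plug-in histogram MI estimate over discretized features: at each step, the feature that most increases the joint MI between the selected subset and the class label is added.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in MI estimate (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

def forward_selection(features, labels, k):
    """Greedy forward selection: at each step, add the feature whose
    inclusion maximizes the joint MI with the class labels.
    `features` is a list of columns of discrete values (hypothetical
    toy setting; the paper targets functional/high-dimensional data)."""
    selected, remaining = [], list(range(len(features)))
    for _ in range(k):
        def joint_mi(j):
            # Represent the candidate subset as tuples of feature values.
            cols = [features[i] for i in selected + [j]]
            return mutual_information(list(zip(*cols)), labels)
        best = max(remaining, key=joint_mi)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A real implementation would also apply a stopping criterion, as in the paper, rather than fixing `k` in advance; for continuous functional data the naive histogram estimator above would be replaced by the classification-specific estimator the paper proposes.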