Combining feature selection and feature reduction for protein classification

  • Authors:
  • Ricco Rakotomalala;Faouzi Mhamdi

  • Affiliations:
  • ERIC, University of Lyon 2, Bron, France;URPAH, University of Tunis, Tunis, Tunisia

  • Venue:
  • SMO'06 Proceedings of the 6th WSEAS International Conference on Simulation, Modelling and Optimization
  • Year:
  • 2006

Abstract

We use n-gram descriptors for a protein classification task. Because they are generated automatically, many of the resulting descriptors are irrelevant and/or redundant. In this paper, we evaluate various strategies of feature selection and feature reduction. First, we separately evaluate the effectiveness of a filter-based feature selection algorithm and of feature reduction based on singular value decomposition (SVD). Then, we evaluate the combination of the two approaches: we propose to use a very tolerant filter that selects, on a univariate basis, which attributes to include in the subsequent SVD. We expect that features extracted from relevant descriptors should yield a better classifier. We test the various approaches with two non-linear classifiers: a 3-nearest-neighbor classifier, which is very sensitive to high dimensionality, and an SVM with an RBF kernel function, which is well regularized. The results show that the behavior of the approaches depends mainly on the characteristics of the supervised learning algorithm.
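The first stage of the combined strategy (n-gram generation followed by a tolerant univariate filter) can be illustrated with a minimal Python sketch. Several choices here are assumptions, since the abstract does not specify them: the n-gram length of 3, the Boolean presence/absence encoding, the Pearson chi-square statistic as the filter criterion, and the threshold value. The function names are hypothetical and are not taken from the paper. The retained columns would then be passed to an SVD, which is not shown.

```python
from collections import Counter


def ngrams(sequence, n=3):
    """Return the set of overlapping n-grams found in one sequence."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


def presence_matrix(sequences, n=3):
    """Build a 0/1 table: one row per sequence, one column per n-gram."""
    grams_per_seq = [ngrams(s, n) for s in sequences]
    vocabulary = sorted(set().union(*grams_per_seq))
    rows = [[1 if g in grams else 0 for g in vocabulary]
            for grams in grams_per_seq]
    return vocabulary, rows


def chi2_score(column, labels):
    """Pearson chi-square statistic between one descriptor and the class."""
    n = len(labels)
    observed = Counter(zip(column, labels))
    col_tot = Counter(column)
    lab_tot = Counter(labels)
    score = 0.0
    for v in col_tot:
        for c in lab_tot:
            expected = col_tot[v] * lab_tot[c] / n
            score += (observed[(v, c)] - expected) ** 2 / expected
    return score


def tolerant_filter(vocabulary, rows, labels, threshold=1.0):
    """Keep every descriptor whose score reaches a (deliberately low) threshold.

    Each column is scored on its own (univariate), so redundancy between
    descriptors is left for the later reduction step to handle.
    """
    keep = [j for j in range(len(vocabulary))
            if chi2_score([r[j] for r in rows], labels) >= threshold]
    kept_vocab = [vocabulary[j] for j in keep]
    kept_rows = [[r[j] for j in keep] for r in rows]
    return kept_vocab, kept_rows


if __name__ == "__main__":
    # Toy sequences and class labels, invented purely for illustration.
    seqs = ["AAAC", "AAAG", "TTTC", "TTTG"]
    classes = ["a", "a", "b", "b"]
    vocab, table = presence_matrix(seqs, n=3)
    kept, reduced = tolerant_filter(vocab, table, classes, threshold=2.0)
    print(kept)
    print(reduced)
```

A low threshold keeps the filter tolerant: it only discards descriptors that show almost no association with the class, leaving the bulk of the dimensionality reduction to the subsequent SVD.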