Combining feature selection and feature reduction for protein classification

Authors:
Ricco Rakotomalala;Faouzi Mhamdi
Affiliations:
ERIC, University of Lyon 2, Bron, France;URPAH, University of Tunis, Tunis, Tunisia
Venue:
SMO'06 Proceedings of the 6th WSEAS International Conference on Simulation, Modelling and Optimization
Year:
2006


Abstract

We use n-gram descriptors for a protein classification task. Because they are generated automatically, many of them are irrelevant and/or redundant. In this paper, we evaluate various strategies for feature selection and feature reduction. First, we evaluate separately the effectiveness of a filter-based feature selection algorithm and of a feature reduction based on singular value decomposition (SVD). Then, we evaluate the combination of the two approaches: we propose to use a very tolerant filter to select, on a univariate basis, which attributes to include in the subsequent SVD. We expect that features extracted from relevant descriptors should make it possible to build a better classifier. We test the various approaches with two non-linear classifiers: a 3-nearest-neighbor classifier, which is very sensitive to high dimensionality, and an SVM with an RBF kernel, which is well regularized. The results show that the behavior of the approaches depends mainly on the characteristics of the supervised learning method.
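
The combined strategy described in the abstract (n-gram descriptors, a tolerant univariate filter, then SVD, then a 3-nearest-neighbor classifier) might be sketched as follows. This is only a minimal illustration under several assumptions that are not taken from the paper: the toy sequences, the function names, the variance-ratio score used as the filter criterion, the `keep_fraction` value, and the number of SVD components are all hypothetical. The SVM with an RBF kernel is not shown.

```python
from collections import Counter
import numpy as np

def ngram_matrix(sequences, n=2, vocab=None):
    """Count overlapping n-grams per sequence; pass `vocab` to encode unseen data."""
    counts = [Counter(s[i:i + n] for i in range(len(s) - n + 1)) for s in sequences]
    if vocab is None:
        vocab = sorted(set().union(*counts))
    X = np.array([[c[g] for g in vocab] for c in counts], dtype=float)
    return X, vocab

def tolerant_filter(X, y, keep_fraction=0.8):
    """Univariate filter (assumed criterion): between/within class variance ratio.
    'Tolerant' here means that a large fraction of the descriptors is kept."""
    overall = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    scores = between / (within + 1e-12)  # small constant avoids division by zero
    n_keep = max(1, int(round(keep_fraction * X.shape[1])))
    return np.sort(np.argsort(scores)[::-1][:n_keep])

def svd_projection(X, k):
    """Return the top-k right singular vectors of X (no centering) as columns."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def knn_predict(train_Z, train_y, test_Z, k=3):
    """Plain k-nearest-neighbor majority vote with Euclidean distance."""
    preds = []
    for z in test_Z:
        dist = np.linalg.norm(train_Z - z, axis=1)
        nearest = train_y[np.argsort(dist)[:k]]
        values, votes = np.unique(nearest, return_counts=True)
        preds.append(values[np.argmax(votes)])
    return np.array(preds)

# Toy data (invented for illustration): two classes of short sequences.
train_seqs = ["AKAKLAKA", "KAKAAKLA", "AKLAKAKA",
              "GWGWPGWG", "WGWGGWPG", "GWPGWGWG"]
train_y = np.array([0, 0, 0, 1, 1, 1])

X_train, vocab = ngram_matrix(train_seqs, n=2)
kept = tolerant_filter(X_train, train_y, keep_fraction=0.8)  # step 1: filter
V = svd_projection(X_train[:, kept], k=2)                    # step 2: SVD
Z_train = X_train[:, kept] @ V

X_test, _ = ngram_matrix(["AKAKAKLA", "GWGWGWPG"], n=2, vocab=vocab)
Z_test = X_test[:, kept] @ V
pred = knn_predict(Z_train, train_y, Z_test, k=3)
print(pred)
```

The filter and the SVD basis are fitted on the training sequences only, and the same column subset and projection are then applied to new sequences, so that the evaluation is not biased by the test data.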