Robust SVM-based biomarker selection with noisy mass spectrometric proteomic data

Authors:
Elena Marchiori;Connie R. Jimenez;Mikkel West-Nielsen;Niels H. H. Heegaard
Affiliations:
Department of Computer Science, Vrije Universiteit Amsterdam, The Netherlands;Department of Molecular and Cellular Neurobiology, Vrije Universiteit Amsterdam, The Netherlands;Department of Autoimmunology, Statens Serum Institut, Copenhagen, Denmark;Department of Autoimmunology, Statens Serum Institut, Copenhagen, Denmark
Venue:
EuroGP'06 Proceedings of the 2006 international conference on Applications of Evolutionary Computing
Year:
2006

Abstract

Computational analysis of mass spectrometric (MS) proteomic data from sera is of potential relevance for diagnosis, prognosis, choice of therapy, and study of disease activity. To this end, feature selection techniques based on machine learning can be applied to detect potential biomarkers and biomarker patterns. A key issue concerns the interpretability and robustness of the results produced by such techniques. In this paper we propose a robust method for feature selection with MS proteomic data. The method consists of the sequential application of a filter feature selection algorithm, RELIEF, followed by multiple runs of a wrapper feature selection technique based on support vector machines (SVM), where each run is obtained by changing the class label of one support vector. The frequencies with which features are selected over the runs are used to identify features that are robust with respect to perturbations of the data. The method is tested on a dataset produced by a specific MS technique, MALDI-TOF MS. Two classes were artificially generated by spiking, and the samples were collected at different storage durations. Leave-one-out cross validation (LOOCV) applied to the resulting dataset indicates that the proposed feature selection method is capable of identifying highly discriminatory proteomic patterns.
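The two ingredients named in the abstract, a RELIEF filter and selection frequencies collected over label-perturbed runs, can be illustrated with a minimal sketch. It is not the authors' implementation. All function names, the toy data, and the 0/1 label encoding are assumptions made for illustration. The RELIEF variant shown visits every sample once and uses absolute differences, which is one common formulation. The abstract's wrapper is SVM-based and flips the labels of support vectors only; here the selector is a plain top-k RELIEF stand-in and the indices to flip are supplied by the caller.

```python
from collections import Counter


def relief_weights(X, y):
    """RELIEF-style feature weights for two classes (higher = more relevant).

    Assumption: every sample is visited once instead of a random subset.
    """
    n, d = len(X), len(X[0])
    # Scale each feature to [0, 1] so differences are comparable.
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(d)]
    Z = [[(row[j] - lo[j]) / span[j] for j in range(d)] for row in X]

    def dist(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda k: dist(Z[i], Z[k]))
        miss = min(misses, key=lambda k: dist(Z[i], Z[k]))
        # Reward features that differ on the nearest miss, penalise
        # features that differ on the nearest hit.
        for j in range(d):
            w[j] += (abs(Z[i][j] - Z[miss][j]) - abs(Z[i][j] - Z[hit][j])) / n
    return w


def top_k_relief(X, y, k):
    """Hypothetical stand-in selector: indices of the k highest-weighted features."""
    w = relief_weights(X, y)
    return sorted(range(len(w)), key=lambda j: w[j], reverse=True)[:k]


def selection_frequencies(X, y, flip_indices, select, k):
    """Fraction of perturbed runs in which each feature is selected.

    Each run flips the 0/1 label of exactly one sample from flip_indices.
    In the abstract's method these would be the support vectors of an SVM.
    """
    counts = Counter()
    for i in flip_indices:
        y_pert = list(y)
        y_pert[i] = 1 - y_pert[i]
        counts.update(select(X, y_pert, k))
    return {j: c / len(flip_indices) for j, c in counts.items()}


# Toy data (invented): feature 0 separates the classes, feature 1 is noise.
X_demo = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
          [0.9, 0.2], [1.0, 0.8], [0.8, 0.4]]
y_demo = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X_demo, y_demo)
freqs = selection_frequencies(X_demo, y_demo, range(len(y_demo)),
                              top_k_relief, k=1)
```

On the toy data, the weight of the informative feature is positive and the weight of the noise feature is negative. Because `select` is passed in as a callable, an SVM-based wrapper could replace the stand-in without changing the frequency-counting step.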