A practical approach to feature selection
ML92 Proceedings of the ninth international workshop on Machine learning
Feature Extraction, Construction and Selection: A Data Mining Perspective
Feature Extraction, Construction and Selection: A Data Mining Perspective
An introduction to variable and feature selection
The Journal of Machine Learning Research
The feature selection problem: traditional methods and a new algorithm
AAAI'92 Proceedings of the tenth national conference on Artificial intelligence
Dimensionality reduction using genetic algorithms
IEEE Transactions on Evolutionary Computation
A genetic embedded approach for gene selection and classification of microarray data
EvoBIO'07 Proceedings of the 5th European conference on Evolutionary computation, machine learning and data mining in bioinformatics
RISC: a new filter approach for feature selection from proteomic data
ICMB'08 Proceedings of the 1st international conference on Medical biometrics
A study of crossover operators for gene selection of microarray data
EA'07 Proceedings of the Evolution artificielle, 8th international conference on Artificial evolution
Hi-index | 0.00 |
Computational analysis of mass spectrometric (MS) proteomic data from sera is of potential relevance for diagnosis, prognosis, choice of therapy, and study of disease activity. To this aim, feature selection techniques based on machine learning can be applied for detecting potential biomarkes and biomaker patterns. A key issue concerns the interpretability and robustness of the output results given by such techniques. In this paper we propose a robust method for feature selection with MS proteomic data. The method consists of the sequentail application of a filter feature selection algorithm, RELIEF, followed by multiple runs of a wrapper feature selection technique based on support vector machines (SVM), where each run is obtained by changing the class label of one support vector. Frequencies of features selected over the runs are used to identify features which are robust with respect to perturbations of the data. This method is tested on a dataset produced by a specific MS technique, called MALDI-TOF MS. Two classes have been artificially generated by spiking. Moreover, the samples have been collected at different storage durations. Leave-one-out cross validation (LOOCV) applied to the resulting dataset, indicates that the proposed feature selection method is capable of identifying highly discriminatory proteomic patterns.