Objective: Marker selection in DNA microarray analysis has so far been addressed by two basic types of approaches, the so-called filter and wrapper methods. Wrapper methods operate recursively: feature (gene) weights are re-evaluated and change from iteration to iteration, whereas in filter methods feature weights remain fixed. Our objective in this study is to show that applying filter criteria recursively, so that weights may be adjusted from cycle to cycle, noticeably improves generalization performance measured on independent test sets.

Methods and materials: To this end, we explore the behavior of two well-known and broadly accepted pattern recognition approaches, namely support vector machines (SVM) and a single linear neuron (LN), properly adapted to the problem of marker selection. Within this context we also show how the kernel capability of SVM can be employed in practice to provide alternative ways of approaching reliable marker selection.

Results: We explore how the proposed approaches behave in two application domains (breast cancer and leukemia), achieving results comparable to or better than those reported in the related literature. An important advantage of these approaches is that their performance remains stable and does not deteriorate with the complexity of the application domain. Validation is performed using internal leave-one-out (ILOO) and 10-fold cross-validation, as well as independent test set evaluation.

Conclusions: Results show that the proposed methodologies achieve strong performance and indicate that applying filter criteria in a wrapper fashion ('wrapper filtering criteria') provides a useful tool for marker selection. The contribution of this study is threefold. First, it provides a methodology for applying filter criteria in a wrapper fashion (a new approach). Second, it introduces a fundamental pattern recognition component, the single neuron (a linear estimator), and explores its behavior in marker selection. Third, it demonstrates an approach to exploiting the kernel capability of SVMs in a practical and effective manner.
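The recursive idea described above, where feature weights are re-estimated on the surviving genes in every cycle instead of being computed once, can be illustrated with a minimal sketch. This is not the authors' implementation: the squared-error gradient-descent training of the linear neuron, the halving schedule, and all names (`train_linear_neuron`, `recursive_weight_elimination`, `drop_frac`) are assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

def train_linear_neuron(X, y, lr=0.01, epochs=200):
    """Fit a single linear unit (w, b) by batch gradient descent on squared error.
    Training rule and hyperparameters are illustrative assumptions."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        err = X @ w + b - y            # residuals, shape (n,)
        w -= lr * (X.T @ err) / n      # gradient step for the weights
        b -= lr * err.mean()           # gradient step for the bias
    return w, b

def recursive_weight_elimination(X, y, n_keep, drop_frac=0.5):
    """Re-estimate weights each cycle and keep the features with largest |w|.
    Returns the sorted column indices of the surviving features."""
    active = np.arange(X.shape[1])
    while active.size > n_keep:
        w, _ = train_linear_neuron(X[:, active], y)
        n_next = max(n_keep, int(active.size * (1 - drop_frac)))
        n_next = min(n_next, active.size - 1)   # always drop at least one feature
        order = np.argsort(-np.abs(w))          # largest |w| first
        active = np.sort(active[order[:n_next]])
    return active

# Synthetic example (hypothetical data): 200 samples, 20 "genes",
# of which only columns 0 and 1 carry class information.
rng = np.random.default_rng(0)
y = np.where(np.arange(200) % 2 == 0, 1.0, -1.0)
X = rng.normal(size=(200, 20))
X[:, 0] += 2.0 * y
X[:, 1] -= 2.0 * y
selected = recursive_weight_elimination(X, y, n_keep=2)
print(selected)
```

A one-shot filter would rank all genes from a single set of weights; here the ranking is recomputed after each elimination round, so a gene's weight can change once competing genes have been removed.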