Class prediction and feature selection are two learning tasks that are strictly paired in the search for molecular profiles from microarray data. Researchers have become aware of how easy it is to incur selection bias, and complex validation setups are required to avoid overly optimistic estimates of model predictive accuracy and incorrect gene selections. This paper describes a semisupervised pattern discovery approach that uses the by-products of complete validation studies on experimental setups for gene profiling. In particular, we introduce the study of the patterns of single-sample responses (sample-tracking profiles) to the gene selection process induced by typical supervised learning tasks in microarray studies. We derive sample-tracking profiles as the aggregated off-training evaluation of SVM models of increasing gene panel sizes. Genes are ranked by E-RFE, an entropy-based variant of recursive feature elimination for support vector machines (RFE-SVM). A Dynamic Time Warping (DTW) algorithm is then applied to define a metric between sample-tracking profiles. Unsupervised clustering based on the DTW metric automates the discovery of outliers and of subtypes of different molecular profiles. Applications are described on synthetic data and in two gene expression studies.
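
As an illustration of the DTW step, the following is a minimal sketch of the textbook dynamic-programming DTW applied to a few made-up profiles. It is not the authors' implementation: the function name `dtw_distance`, the absolute-difference local cost, and the toy profile values are assumptions introduced here, and each profile is assumed to be a sequence of per-sample off-training error rates indexed by gene panel size.

```python
import math

def dtw_distance(a, b):
    """Textbook DTW cost between two numeric sequences (illustrative only)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]

# Hypothetical sample-tracking profiles: one error rate per gene panel size.
profiles = {
    "s1": [0.1, 0.1, 0.2, 0.9, 0.9],
    "s2": [0.1, 0.2, 0.9, 0.9, 0.9],  # same shape as s1, shifted by one step
    "s3": [0.9, 0.8, 0.9, 0.1, 0.1],  # opposite trend
}

# Pairwise DTW costs, the input a clustering step would consume.
names = sorted(profiles)
dist = {(p, q): dtw_distance(profiles[p], profiles[q])
        for p in names for q in names}
```

In this toy setting the shifted profiles `s1` and `s2` end up closer to each other than either is to `s3`, which is the kind of shape-based grouping a DTW-driven clustering would exploit. Any clustering method that accepts a precomputed pairwise dissimilarity table could take `dist` as input.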