Since most cancer treatments carry some degree of toxicity, it is essential to identify the cancer type correctly before administering the relevant therapy. With the arrival of powerful tools such as gene expression microarrays, the basis of cancer classification is gradually shifting from morphological properties to molecular signatures. Several recent studies have demonstrated a marked improvement in the prediction accuracy of tumor types when gene expression microarray measurements are used instead of clinical markers. The main challenge in working with gene expression microarrays is the very large number of genes, of which only a small fraction are actually relevant for differentiating between cancer types. To overcome this challenge, a Bayesian nearest neighbor model equipped with an integrated variable selection technique is proposed. This classification and gene selection model classifies different cancer types accurately and simultaneously identifies the relevant genes. The proposed model is completely automatic in the sense that it adaptively chooses the neighborhood size and the important covariates. The method is successfully applied to three simulated data sets and four well-known real data sets. To demonstrate its competitiveness, a comparative study is carried out against several popular off-the-shelf classification methods. On all the simulated and real data sets, the proposed method produced highly competitive, if not better, results. Whereas the standard approach builds the model in two steps, gene selection followed by tumor prediction, this adaptive gene selection technique selects the relevant genes and predicts the tumor class in a single step. The biological relevance of the selected genes is also discussed to validate the claim.