A two step method to identify clinical outcome relevant genes with microarray data

Authors:
Bin Han;Lihua Li;Yan Chen;Lei Zhu;Qi Dai
Affiliations:
Institute for Biomedical Engineering and Instruments, School of Automation, Hangzhou Dianzi University, PR China;Institute for Biomedical Engineering and Instruments, School of Automation, Hangzhou Dianzi University, PR China;Institute for Biomedical Engineering and Instruments, School of Automation, Hangzhou Dianzi University, PR China;Institute for Biomedical Engineering and Instruments, School of Automation, Hangzhou Dianzi University, PR China;Institute for Biomedical Engineering and Instruments, School of Automation, Hangzhou Dianzi University, PR China
Venue:
Journal of Biomedical Informatics
Year:
2011

Citing 15
Cited 0

Gene Selection for Cancer Classification using Support Vector Machines

Machine Learning
Causal Discovery from Changes

UAI '01 Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence
Optimization models for cancer classification: extracting gene interaction information from microarray expression data

Bioinformatics
Extracting gene regulation information for cancer classification

Pattern Recognition
Selecting differentially expressed genes using minimum probability of classification error

Journal of Biomedical Informatics
An integrated algorithm for gene selection and classification applied to microarray data of ovarian cancer

Artificial Intelligence in Medicine
A review of feature selection techniques in bioinformatics

Bioinformatics
Monte Carlo feature selection for supervised classification

Bioinformatics
Identification of gene transcript signatures predictive for estrogen receptor and lymph node status using a stepwise forward selection artificial neural network modelling approach

Artificial Intelligence in Medicine
Combining partial correlation and an information theory approach to the reversed engineering of gene co-expression networks

Bioinformatics
Supervised principal component analysis for gene set enrichment of microarray data with continuous or survival outcomes

Bioinformatics
Simultaneous cancer classification and gene selection with Bayesian nearest neighbor method: An integrated approach

Computational Statistics & Data Analysis
Prediction of Cancer Class with Majority Voting Genetic Programming Classifier Using Gene Expression Data

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Bayesian binary kernel probit model for microarray based cancer classification and gene selection

Computational Statistics & Data Analysis
Gene feature extraction using T-test statistics and kernel partial least squares

ICONIP'06 Proceedings of the 13th international conference on Neural information processing - Volume Part III

Abstract

With advances in microarray technology, many biomarker selection approaches have been proposed for cancer diagnosis. Marker sets are selected by scoring genes for how well they discriminate between disease classes [1-4], or genes are ranked by significance analysis without reference to a classification task. However, there is a pressing need for methods that integrate prior biological knowledge into the gene selection process. In this study, we propose to identify genes primarily in terms of their relevance to the diagnostic outcome. Because observed gene expression is a combination of effects, the microarray data are decomposed by singular value decomposition (SVD), and the eigenvectors that correspond to the biological effect of clinical outcomes are identified. Genes that play important roles in determining this biological effect are then detected. Genes are thus identified by the strength of their association with clinical outcomes, and the relationship between genes and clinical outcomes is analyzed. Monte Carlo simulations are then used to fine-tune the selected gene set in terms of classification accuracy. The approach was tested on four public data sets. Comparative studies show that the selected genes achieved higher classification accuracies. Graphical analysis shows that they are closely related to the cancer class. Statistical simulation shows that the gene set found by the proposed method is also less variable and comparatively invariant to external influences. The biological relevance of the selected genes is further discussed and validated through a literature study and analysis of biological databases.
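
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how outcome-related eigenvectors are chosen, how genes are scored, or which classifier the Monte Carlo step uses. The sketch assumes a binary outcome, picks the SVD component most correlated with the class labels, scores genes by their absolute loading on that component, and evaluates random gene subsets with a nearest-centroid classifier. All function names and parameters (`rank_genes_by_outcome`, `centroid_accuracy`, `monte_carlo_refine`, `subset_size`, `n_trials`) are hypothetical.

```python
import numpy as np

def rank_genes_by_outcome(X, y, tol=1e-10):
    """Step 1 (assumed form): X is genes x samples, y holds 0/1 labels.
    Returns genes ranked by loading on the most outcome-correlated component."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    Xc = X - X.mean(axis=1, keepdims=True)          # center each gene
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = np.flatnonzero(s > tol * s[0])           # drop null components
    yc = y - y.mean()
    # Kept rows of Vt have zero mean and unit norm, so this is |Pearson r|.
    corr = np.abs(Vt[keep] @ yc) / np.linalg.norm(yc)
    k = keep[np.argmax(corr)]
    ranked = np.argsort(-np.abs(U[:, k]))           # largest loading first
    return ranked, int(k), float(corr.max())

def centroid_accuracy(X, y, genes, rng, n_splits=20, test_frac=0.3):
    """Mean accuracy of a nearest-centroid classifier over random splits.
    The classifier is an assumption; the abstract does not name one."""
    Z = np.asarray(X, dtype=float)[genes]
    n = Z.shape[1]
    n_test = max(1, int(test_frac * n))
    accs = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        if len(np.unique(y[train])) < 2:            # need both classes
            continue
        c0 = Z[:, train[y[train] == 0]].mean(axis=1)
        c1 = Z[:, train[y[train] == 1]].mean(axis=1)
        d0 = np.linalg.norm(Z[:, test] - c0[:, None], axis=0)
        d1 = np.linalg.norm(Z[:, test] - c1[:, None], axis=0)
        pred = (d1 < d0).astype(int)
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs)) if accs else 0.0

def monte_carlo_refine(X, y, candidates, subset_size, n_trials=50, seed=0):
    """Step 2 (assumed form): sample random subsets of the candidate genes
    and keep the subset with the best estimated accuracy."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    best, best_acc = None, -1.0
    for _ in range(n_trials):
        subset = rng.choice(candidates, size=subset_size, replace=False)
        acc = centroid_accuracy(X, y, subset, rng)
        if acc > best_acc:
            best, best_acc = np.sort(subset), acc
    return best, best_acc
```

A typical call would pass the top-ranked genes from `rank_genes_by_outcome` as `candidates` to `monte_carlo_refine`. The sizes of the candidate pool and of the final subset are free choices here, not values taken from the paper.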