Bayesian model averaging: development of an improved multi-class, gene selection and classification tool for microarray data

Authors:
Ka Yee Yeung; Roger E. Bumgarner; Adrian E. Raftery
Affiliations:
Department of Microbiology, University of Washington, Seattle, WA 98195, USA; Department of Microbiology, University of Washington, Seattle, WA 98195, USA; Department of Statistics, University of Washington, Seattle, WA 98195, USA
Venue:
Bioinformatics
Year:
2005

Abstract

Motivation: Selecting a small number of relevant genes for accurate classification of samples is essential for the development of diagnostic tests. We present the Bayesian model averaging (BMA) method for gene selection and classification of microarray data. Typical gene selection and classification procedures ignore model uncertainty and use a single set of relevant genes (a model) to predict the class. BMA accounts for the uncertainty about which set to choose by averaging over multiple models (sets of potentially overlapping relevant genes).

Results: We show that BMA selects fewer relevant genes than other methods and achieves high prediction accuracy on three microarray datasets. Our BMA algorithm is applicable to microarray datasets with any number of classes, and it outputs posterior probabilities for the selected genes and models. Our selected models typically consist of only a few genes. The combination of high accuracy, small numbers of genes and posterior probabilities for the predictions should make BMA a powerful tool for developing diagnostics from expression data.

Availability: The source code and datasets used are available from our Supplementary website.

Contact: kayee@u.washington.edu

Supplementary information: http://www.expression.washington.edu/publications/kayee/bma
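
The averaging idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a handful of candidate models (gene sets) have already been found, each with an estimated posterior model probability and its own predicted class probabilities for one sample. The gene names, weights and probabilities are hypothetical values made up for illustration, and the function names are ours. How the candidate models are found and how their posterior probabilities are estimated is not covered here.

```python
from typing import Dict, FrozenSet, List


def normalize(weights: List[float]) -> List[float]:
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def bma_class_probabilities(
    model_posteriors: List[float],
    per_model_class_probs: List[List[float]],
) -> List[float]:
    """Average each model's class probabilities, weighted by the
    posterior probability of that model."""
    weights = normalize(model_posteriors)
    n_classes = len(per_model_class_probs[0])
    averaged = [0.0] * n_classes
    for weight, probs in zip(weights, per_model_class_probs):
        for k, p in enumerate(probs):
            averaged[k] += weight * p
    return averaged


def gene_posteriors(
    models: List[FrozenSet[str]],
    model_posteriors: List[float],
) -> Dict[str, float]:
    """Posterior probability that a gene is relevant: the summed
    posterior probability of every model that contains it."""
    weights = normalize(model_posteriors)
    result: Dict[str, float] = {}
    for genes, weight in zip(models, weights):
        for gene in genes:
            result[gene] = result.get(gene, 0.0) + weight
    return result


# Hypothetical inputs: three overlapping gene sets and a three-class problem.
models = [
    frozenset({"geneA", "geneB"}),
    frozenset({"geneA", "geneC"}),
    frozenset({"geneD"}),
]
model_posteriors = [0.5, 0.3, 0.2]
per_model_class_probs = [
    [0.90, 0.05, 0.05],
    [0.60, 0.30, 0.10],
    [0.20, 0.70, 0.10],
]

class_probs = bma_class_probabilities(model_posteriors, per_model_class_probs)
predicted_class = max(range(len(class_probs)), key=class_probs.__getitem__)
gene_probs = gene_posteriors(models, model_posteriors)

print("averaged class probabilities:", class_probs)
print("predicted class index:", predicted_class)
print("gene posterior probabilities:", gene_probs)
```

In this toy setup a gene that appears in several well-supported models (here the hypothetical geneA) receives a high posterior probability, while the final prediction comes with a probability for every class rather than a single hard label.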