A Bayesian hybrid Huberized support vector machine and its applications in high-dimensional medical data

Authors:
Sounak Chakraborty;Ruixin Guo
Affiliations:
Department of Statistics, University of Missouri-Columbia, 209F Middlebush Hall, Columbia, MO 65211-6100, USA;Department of Biostatistics, University of North Carolina at Chapel Hill, 3101 McGavran-Greenberg, CB 7420, Chapel Hill, NC 27599, USA
Venue:
Computational Statistics & Data Analysis
Year:
2011

Abstract

A hybrid Huberized support vector machine (HHSVM) with an elastic-net penalty has been developed for cancer tumor classification based on thousands of gene expression measurements. In this paper, we develop a Bayesian formulation of the hybrid Huberized support vector machine for binary classification. For the coefficients of the linear classification boundary, we propose a new type of prior, which can select variables and group them together simultaneously. Our proposed prior is a scale mixture of normal distributions with independent gamma priors on a transformation of the variances of the normal distributions. We establish a direct connection between the Bayesian HHSVM model with our special prior and the standard HHSVM solution with the elastic-net penalty. We propose a hierarchical Bayes technique and an empirical Bayes technique to select the penalty parameter. In the hierarchical Bayes model, the penalty parameter is selected using a beta prior. For the empirical Bayes model, we estimate the penalty parameter by maximizing the marginal likelihood. The proposed model is applied to two simulated data sets and three real-life gene expression microarray data sets. Results suggest that our Bayesian models are highly successful in selecting groups of similarly behaved important genes and predicting the cancer class. Most of the genes selected by our models have shown strong association with well-studied genetic pathways, further validating our claims.
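
To make the non-Bayesian starting point concrete, the following is a minimal sketch of the penalized objective that a standard HHSVM with an elastic-net penalty is usually described as minimizing. The abstract does not state the formula, so the piecewise form of the Huberized hinge loss, the unnormalized sum over observations, the threshold `delta`, the penalty weights `lam1` and `lam2`, and all function names are assumptions chosen for illustration. The sketch does not implement the Bayesian prior or the hierarchical and empirical Bayes procedures proposed in the paper.

```python
def huberized_hinge(t, delta=2.0):
    """Huberized hinge loss at margin t = y * f(x) (assumed form).

    Zero for t > 1, quadratic for 1 - delta < t <= 1, linear below that.
    The two pieces meet continuously at t = 1 - delta.
    """
    if t > 1.0:
        return 0.0
    if t > 1.0 - delta:
        return (1.0 - t) ** 2 / (2.0 * delta)
    return 1.0 - t - delta / 2.0


def elastic_net_penalty(beta, lam1, lam2):
    """Elastic-net penalty: lam1 * ||beta||_1 + (lam2 / 2) * ||beta||_2^2."""
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam1 * l1 + 0.5 * lam2 * l2_squared


def hhsvm_objective(X, y, beta0, beta, lam1, lam2, delta=2.0):
    """Summed Huberized hinge loss plus elastic-net penalty.

    X is a list of feature rows, y a list of labels in {-1, +1},
    and the classifier is the linear boundary f(x) = beta0 + x . beta.
    """
    loss = 0.0
    for row, label in zip(X, y):
        margin = label * (beta0 + sum(b * x for b, x in zip(beta, row)))
        loss += huberized_hinge(margin, delta)
    return loss + elastic_net_penalty(beta, lam1, lam2)


if __name__ == "__main__":
    # Toy data: two observations, two features (hypothetical values).
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [1, -1]
    print(hhsvm_objective(X, y, beta0=0.0, beta=[2.0, 0.0], lam1=0.1, lam2=0.1))
```

In this formulation the L1 term drives individual coefficients to zero (variable selection), while the L2 term pulls coefficients of correlated predictors toward each other (grouping), which is the behavior the proposed prior is designed to reproduce in the Bayesian setting.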