Selecting marker genes for cancer classification using supervised weighted kernel clustering and the support vector machine

Authors:
Jooyong Shim;Insuk Sohn;Sujong Kim;Jae Won Lee;Paul E. Green;Changha Hwang
Affiliations:
Department of Applied Statistics, Catholic University of Daegu, Kyungbuk 712-702, Republic of Korea;Department of Statistics, Korea University, Seoul 136-701, Republic of Korea;Department of Biochemistry, College of Medicine, Hanyang University, Seoul 133-791, Republic of Korea;Department of Statistics, Korea University, Seoul 136-701, Republic of Korea;Transportation Research Institute, University of Michigan, Ann Arbor, 48109-2150, USA;Division of Information and Computer Science, Dankook University, Gyeonggido 448-701, Republic of Korea
Venue:
Computational Statistics & Data Analysis
Year:
2009

Abstract

Growing interest in the analysis of DNA microarray data has driven the development of new statistical classification methods. The goal is to assign a sample to the relevant diagnostic category on the basis of its gene expression profile. When samples are classified into cancer types, however, some genes carry no useful information, and the informative genes differ in importance. A novel algorithm is presented for selecting the relevant genes, referred to as marker genes, for cancer classification. The algorithm is based on the Support Vector Machine (SVM) and Supervised Weighted Kernel Clustering (SWKC). To investigate its performance, the algorithm was applied to a simulated data set and several real data sets. For comparison, other well-known methods were also considered: Prediction Analysis of Microarrays (PAM), Support Vector Machine-Recursive Feature Elimination (SVM-RFE), and the Structured Polychotomous Machine (SPM). The experimental results indicate that the proposed SWKC/SVM algorithm is conceptually much simpler than existing methods for identifying marker genes for cancer classification and performs more efficiently. It also requires much less computing time than the other methods.
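
The abstract does not spell out the SWKC step, so the sketch below does not attempt to reproduce the proposed algorithm. It illustrates the general idea of SVM-based marker gene selection using recursive feature elimination, one of the comparison methods named above, on a small simulated data set. Everything in it is an assumption made for illustration: the function names (`train_linear_svm`, `svm_rfe`, `simulate`), the plain subgradient-descent trainer, the hyperparameters, and the simulated data in which the first three genes are the markers.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50):
    """Fit a linear SVM by batch subgradient descent on the
    L2-regularised hinge loss. Hyperparameters are illustrative."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # this sample violates the margin
                for j in range(p):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_rfe(X, y, n_keep):
    """Retrain, then drop the gene with the smallest squared weight,
    until only n_keep genes remain. Returns original gene indices."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active

def simulate(n_per_class=30, n_genes=10, n_markers=3, shift=3.0, seed=0):
    """Two classes (-1, +1); only the first n_markers genes differ
    between classes, the rest are pure Gaussian noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (-1, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for j in range(n_markers):
                row[j] += label * shift
            X.append(row)
            y.append(label)
    return X, y

X, y = simulate()
selected = svm_rfe(X, y, n_keep=3)
print("selected gene indices:", selected)
```

With a class separation this strong, the eliminated genes should be the noise genes. On real microarray data, where genes number in the thousands and samples in the tens, the ranking is far less stable and would normally be assessed by cross-validation.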