Gene and sample selection for cancer classification with support vectors based t-statistic

Authors:
Piyushkumar A. Mundra;Jagath C. Rajapakse
Affiliations:
Bioinformatics Research Center, School of Computer Engineering, Nanyang Technological University, Singapore;Bioinformatics Research Center, School of Computer Engineering, Nanyang Technological University, Singapore and Singapore-MIT Alliance, Singapore and Department of Biological Engineering, Massachu ...
Venue:
Neurocomputing
Year:
2010



Abstract

The t-statistic is widely used for gene ranking in the analysis of microarray gene expression data. Such a filter-based criterion is generally computed using all the training samples, although not all of them may be equally important for the classification task. In this paper, we decompose the t-statistic into two parts, corresponding to relevant and irrelevant data points. The relevant data points are selected using support vectors and are then used to compute the t-statistic for feature selection. Simultaneously selecting data points and genes yields significantly better classification results on synthetic data as well as on several benchmark cancer datasets.
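
The idea of computing a t-statistic on a subset of relevant samples can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below assumes a Welch-style two-sample t-statistic and takes the indices of the relevant samples as an input. The names `t_score`, `rank_genes`, and `keep` and the toy data are hypothetical. In practice the indices might come from a trained support vector machine (for example, the `support_` attribute of a fitted scikit-learn `SVC`); here they are simply hard-coded.

```python
from math import sqrt
from statistics import mean, variance


def t_score(pos, neg):
    """Absolute Welch-style t-statistic between two lists of values.

    Assumption: this form stands in for the paper's statistic, which the
    abstract does not spell out. Each list needs at least two values.
    """
    se = sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
    return abs(mean(pos) - mean(neg)) / se if se > 0 else 0.0


def rank_genes(X, y, keep=None):
    """Rank genes (columns of X) by t-score over the samples listed in `keep`.

    keep=None uses every training sample (the conventional filter);
    otherwise only the listed row indices contribute to the statistic.
    """
    rows = list(range(len(X))) if keep is None else list(keep)
    scores = []
    for g in range(len(X[0])):
        pos = [X[i][g] for i in rows if y[i] == 1]
        neg = [X[i][g] for i in rows if y[i] == -1]
        scores.append(t_score(pos, neg))
    order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return order, scores


# Toy data: 6 samples x 2 genes; gene 0 separates the classes, gene 1 does not.
X = [
    [1.0, 5.0],
    [1.2, 9.0],
    [0.8, 1.0],
    [3.0, 5.5],
    [3.2, 0.5],
    [2.8, 9.5],
]
y = [1, 1, 1, -1, -1, -1]

# Hypothetical indices of the "relevant" samples (e.g. support vectors).
relevant = [0, 1, 3, 4]

order_all, scores_all = rank_genes(X, y)            # all training samples
order_sv, scores_sv = rank_genes(X, y, keep=relevant)  # relevant samples only
```

Passing the sample subset as a plain argument keeps the ranking step independent of how the relevant points are chosen, so the same function covers both the conventional filter and the subset-based variant.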