A novel class dependent feature selection method for cancer biomarker discovery

Authors:
Wengang Zhou;Julie A. Dickerson
Venue:
Computers in Biology and Medicine
Year:
2014

Abstract

Identifying key biomarkers for different cancer types can improve diagnostic accuracy and treatment. Gene expression data can help differentiate between cancer subtypes. However, because a typical dataset contains far fewer samples than genes, classification models tend to overfit. Feature selection methods can help by selecting the most discriminative feature sets for classifying different cancers. A new class-dependent feature selection approach is proposed that integrates the F-statistic, Maximum Relevance Binary Particle Swarm Optimization (MRBPSO), and a Class Dependent Multi-category Classification (CDMC) system. The method combines filter-based and wrapper-based feature selection. As the filter step, a set of highly differentially expressed genes (features) is pre-selected for each dataset using the F-statistic. MRBPSO and CDMC then function as the wrapper: they select a desirable feature subset for each class and classify the samples using those class-dependent feature subsets. The performance of the proposed methods is evaluated on eight real cancer datasets. The results indicate that the class-dependent approaches can effectively identify biomarkers related to each cancer type and improve classification accuracy compared to class-independent feature selection methods.
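
The filter step described in the abstract can be illustrated with a minimal sketch. It assumes the F-statistic is the standard one-way ANOVA F-statistic computed per gene across the class labels; the abstract does not give the exact formulation or the number of genes retained, so the function names `f_statistic` and `top_k_genes` and the parameter `k` are hypothetical. The MRBPSO and CDMC wrapper stages are not sketched here because the abstract does not specify them.

```python
from collections import defaultdict


def f_statistic(values, labels):
    """One-way ANOVA F-statistic for one gene (assumed filter criterion)."""
    # Group the expression values of this gene by class label.
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)

    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")

    grand_mean = sum(values) / n
    between = 0.0  # between-class sum of squares
    within = 0.0   # within-class sum of squares
    for g in groups.values():
        m = sum(g) / len(g)
        between += len(g) * (m - grand_mean) ** 2
        within += sum((v - m) ** 2 for v in g)

    ms_between = between / (k - 1)
    ms_within = within / (n - k)
    if ms_within == 0.0:
        # Zero within-class variance: perfectly separated or constant gene.
        return float("inf") if ms_between > 0.0 else 0.0
    return ms_between / ms_within


def top_k_genes(X, labels, k):
    """Rank genes (columns of X) by F-statistic and keep the top k.

    X is a list of samples, each a list of expression values.
    Returns (selected column indices, per-gene scores).
    """
    n_genes = len(X[0])
    scores = [f_statistic([row[j] for row in X], labels) for j in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores
```

Calling `top_k_genes(X, labels, k)` on an expression matrix with one row per sample returns the indices of the `k` highest-scoring genes, which a wrapper stage could then search over. Choosing `k` is a design decision that trades search cost against the risk of discarding informative genes.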