International Journal of Data Mining and Bioinformatics
Identifying key biomarkers for different cancer types can improve diagnostic accuracy and treatment. Gene expression data can help differentiate between cancer subtypes. However, such datasets typically contain far fewer samples than genes, which leads classification models to overfit. Feature selection methods can help by selecting the most discriminative feature sets for classifying different cancers. This work proposes a new class-dependent feature selection approach that integrates the F-statistic, Maximum Relevance Binary Particle Swarm Optimization (MRBPSO), and a Class Dependent Multi-category Classification (CDMC) system, combining filter-based and wrapper-based methods. For each dataset, the F-statistic acts as a filter that pre-selects a set of highly differentially expressed genes (features). MRBPSO and CDMC then function as a wrapper: they select a desirable feature subset for each class and classify the samples using those class-dependent feature subsets. The performance of the proposed methods is evaluated on eight real cancer datasets. The results indicate that the class-dependent approaches can effectively identify biomarkers related to each cancer type and improve classification accuracy compared with class-independent feature selection methods.
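The filter stage described above can be illustrated with a minimal sketch. It assumes the F-statistic is the standard one-way ANOVA F computed per gene across the class labels, which the abstract does not spell out. The function names `f_statistic` and `preselect_genes`, the `top_k` parameter, and the toy data are all hypothetical and chosen only for illustration. The MRBPSO and CDMC wrapper stage is not sketched here.

```python
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F statistic for one gene (assumed form of the filter)."""
    # Group the expression values of this gene by class label.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")

    grand_mean = mean(values)
    # Between-class sum of squares: spread of class means around the grand mean.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within-class sum of squares: spread of samples around their class mean.
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def preselect_genes(expression, labels, top_k):
    """Return indices of the top_k genes ranked by F statistic, highest first.

    expression: list of samples, each a list of per-gene expression values.
    """
    n_genes = len(expression[0])
    scores = [
        f_statistic([sample[j] for sample in expression], labels)
        for j in range(n_genes)
    ]
    ranked = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return ranked[:top_k]


# Toy data (hypothetical): 6 samples, 3 genes, 3 classes.
expression = [
    [1.0, 5.0, 2.0],
    [1.2, 4.0, 2.1],
    [3.0, 5.5, 2.0],
    [3.1, 4.5, 1.9],
    [5.0, 5.0, 2.2],
    [5.2, 4.2, 2.0],
]
labels = ["A", "A", "B", "B", "C", "C"]
selected = preselect_genes(expression, labels, top_k=2)
print(selected)
```

In this toy example the first gene separates the three classes cleanly, so it ranks highest. In a pipeline like the one described, the surviving genes would then be passed to the wrapper stage, which searches for a separate subset per class.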