Two-Class SVM trees (2-SVMT) for biomarker data analysis

Authors:
Shaoning Pang;Ilkka Havukkala;Nikola Kasabov
Affiliations:
Knowledge Engineering & Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand;Knowledge Engineering & Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand;Knowledge Engineering & Discovery Research Institute, Auckland University of Technology, Auckland, New Zealand
Venue:
ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part III
Year:
2006


Abstract

High-dimensional two-class biomarker data (e.g. microarray and proteomics data with few samples but large numbers of variables) are often difficult to classify. Many currently used methods cannot easily deal with unbalanced datasets (where the numbers of samples in class 1 and class 2 are very different). This problem can be alleviated by the following new method: first, partition the sample data space recursively, then use a two-class support vector machine tree (2-SVMT) for classification. Recursive partitioning divides the feature space into more manageable portions, from which informative features are more easily found by 2-SVMT. Using two-class microarray and proteomics data for cancer diagnostics, we demonstrate that 2-SVMT results in higher classification accuracy, and especially more consistent classification across various datasets, than standard SVM, KNN or C4.5. The main advantage of the method is its strong robustness on class-unbalanced datasets.
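
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea (recursive partitioning of the data space with a two-class SVM trained where a partition still contains both classes). Everything concrete here is an assumption made for illustration and is not taken from the paper: the 2-means split rule, the `max_depth` and `min_size` stopping criteria, the class-weighted subgradient trainer `train_linear_svm`, and the toy data.

```python
# Illustrative sketch only: NOT the authors' implementation of 2-SVMT.
# Assumed (hypothetical) choices: 2-means partitioning, depth/size stopping
# rules, and a class-weighted linear SVM trained by batch subgradient descent.

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def two_means(X, iters=20):
    """Split the rows of X into two clusters; returns (centroids, assignments)."""
    # Deterministic start: the first point and the point farthest from it.
    c = [X[0], max(X, key=lambda p: sq_dist(p, X[0]))]
    assign = []
    for _ in range(iters):
        new = [0 if sq_dist(p, c[0]) <= sq_dist(p, c[1]) else 1 for p in X]
        if new == assign:
            break
        assign = new
        for k in (0, 1):
            members = [p for p, a in zip(X, assign) if a == k]
            if members:
                c[k] = [sum(col) / len(members) for col in zip(*members)]
    return c, assign

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear soft-margin SVM (hinge loss + L2 penalty), labels in {-1, +1}.
    Errors are weighted inversely to class frequency to counter imbalance."""
    n, d = len(X), len(X[0])
    n_pos = sum(1 for t in y if t == 1)
    weight = {1: n / (2.0 * n_pos), -1: n / (2.0 * (n - n_pos))}
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # margin violated
                c = weight[yi] / n
                for j in range(d):
                    gw[j] -= c * yi * xi[j]
                gb -= c * yi
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def build_tree(X, y, depth=0, max_depth=2, min_size=4):
    """Recursively partition the data; pure partitions become constant leaves,
    mixed partitions that are no longer split get a local two-class SVM."""
    if len(set(y)) == 1:
        return {"label": y[0]}
    if depth < max_depth and len(X) >= min_size:
        centroids, assign = two_means(X)
        sides = [[i for i, a in enumerate(assign) if a == k] for k in (0, 1)]
        if sides[0] and sides[1]:
            children = [
                build_tree([X[i] for i in idx], [y[i] for i in idx],
                           depth + 1, max_depth, min_size)
                for idx in sides
            ]
            return {"centroids": centroids, "children": children}
    return {"svm": train_linear_svm(X, y)}

def predict(node, x):
    # Route x to the nearest centroid at each internal node.
    while "children" in node:
        c0, c1 = node["centroids"]
        node = node["children"][0 if sq_dist(x, c0) <= sq_dist(x, c1) else 1]
    if "label" in node:
        return node["label"]
    w, b = node["svm"]
    return 1 if dot(w, x) + b >= 0 else -1

# Toy unbalanced data (made up): 12 samples of class -1, 2 samples of class +1.
A = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.5, 0.0]]
B = [[8.0, 0.0], [8.0, 1.0]]
C = [[20.0, 0.0], [21.0, 0.0], [20.0, 1.0], [21.0, 1.0], [20.5, 0.5], [20.5, 1.0]]
X = A + B + C
y = [-1] * len(A) + [1] * len(B) + [-1] * len(C)

tree = build_tree(X, y, max_depth=1)  # shallow tree, so one leaf holds an SVM
correct = sum(predict(tree, x) == t for x, t in zip(X, y))
print("training accuracy:", correct / len(X))
```

In this toy layout the minority class sits between two groups of the majority class, so no single linear boundary separates the classes; after one partition, each region is either pure or linearly separable. The class weighting inside `train_linear_svm` is one common way to handle imbalance and may differ from what the paper does.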