The Random Subspace Method for Constructing Decision Forests
IEEE Transactions on Pattern Analysis and Machine Intelligence
Evolving Connectionist Systems: Methods and Applications in Bioinformatics, Brain Study and Intelligent Machines
SMOTE: synthetic minority over-sampling technique
Journal of Artificial Intelligence Research
A fast and elitist multiobjective genetic algorithm: NSGA-II
IEEE Transactions on Evolutionary Computation
IEEE Transactions on Neural Networks
Hi-index | 0.00 |
High dimensionality two-class biomarker data (e.g. microarray and proteomics data with few samples but large numbers of variables) is often difficult to classify. Many currently used methods cannot easily deal with unbalanced datasets (when the number of samples in class 1 and class 2 are very different). This problem can be alleviated by the following new method: first, sample data space by recursive partitions, then use two-class support vector machine tree (2-SVMT) for classification. Recursive partitioning divides the feature space into more manageable portions, from which informative features are more easily found by 2-SVMT. Using two-class microarray and proteomics data for cancer diagnostics, we demonstrate that 2-SVMT results in higher classification accuracy and especially more consistent classification of various datasets than standard SVM, KNN or C4.5. The advantage of the method is its super robustness for class unbalanced datasets.