Several SVM Ensemble Methods Integrated with Under-Sampling for Imbalanced Data Learning

  • Authors:
  • Zhiyong Lin, Zhifeng Hao, Xiaowei Yang, Xiaolan Liu

  • Affiliations:
  • School of Computer Science and Engineering, South China University of Technology, Guangzhou 510640, and Department of Computer Science, Guangdong Polytechnic Normal University, Guangzhou 510665
  • Guangdong University of Technology, Guangzhou 510006
  • College of Science, South China University of Technology, Guangzhou 510640 (sophyca@yahoo.cn)
  • College of Science, South China University of Technology, Guangzhou 510640 (sophyca@yahoo.cn)

  • Venue:
  • ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications
  • Year:
  • 2009

Abstract

Imbalanced data learning (IDL) is one of the most active and important fields in machine learning research. This paper explores the effectiveness of four SVM ensemble methods integrated with under-sampling for IDL. Experimental results on 20 imbalanced UCI datasets show that the two new ensemble algorithms proposed in this paper, CABagE (bagging-style) and MABstE (boosting-style), produce SVM ensemble classifiers that recognize the minority class better than those built by existing ensemble methods. Further analysis of the experimental results indicates that MABstE achieves the best overall classification performance, which we attribute to its more robust example-weighting mechanism.
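The abstract does not spell out the algorithms, but the general scheme it builds on, a bagging-style ensemble where each round under-samples the majority class down to the minority-class size before training a base learner, can be sketched as follows. This is an illustrative sketch only, not the authors' CABagE or MABstE: all function names here are hypothetical, and a toy nearest-centroid learner stands in for the SVM base classifier to keep the example self-contained.

```python
import random

def nearest_centroid_fit(X, y):
    """Toy base learner standing in for an SVM: one centroid per class."""
    centroids = {}
    for label in set(y):
        pts = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [sum(col) / len(pts) for col in zip(*pts)]
    return centroids

def nearest_centroid_predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda lbl: sq_dist(centroids[lbl], x))

def underbagging_fit(X, y, minority_label, n_rounds=11, seed=0):
    """Bagging with under-sampling: each round keeps all minority examples
    and draws a majority-class subset of the same size, then trains a
    base learner on the resulting balanced set."""
    rng = random.Random(seed)
    minority = [(x, l) for x, l in zip(X, y) if l == minority_label]
    majority = [(x, l) for x, l in zip(X, y) if l != minority_label]
    models = []
    for _ in range(n_rounds):
        sampled = rng.sample(majority, k=min(len(minority), len(majority)))
        balanced = minority + sampled
        Xi = [x for x, _ in balanced]
        yi = [l for _, l in balanced]
        models.append(nearest_centroid_fit(Xi, yi))
    return models

def underbagging_predict(models, x):
    """Majority vote over the ensemble's base-learner predictions."""
    votes = [nearest_centroid_predict(m, x) for m in models]
    return max(set(votes), key=votes.count)
```

Because every round sees all minority examples but only a fresh random slice of the majority class, each base learner trains on a balanced set while the ensemble as a whole still covers most of the majority data, which is what improves minority-class recognition relative to a single classifier trained on the raw imbalanced set.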