In recent years, learning from imbalanced data has attracted growing attention from both academia and industry, driven by the explosive growth of applications that produce and consume imbalanced data. However, because of the complex characteristics of imbalanced data, many real-world solutions struggle to deliver robust performance in learning-based applications. To address this problem, this paper presents Ranked Minority Oversampling in Boosting (RAMOBoost), an adaptive synthetic data generation technique embedded in an ensemble learning system. Briefly, at each learning iteration RAMOBoost ranks minority class instances according to a sampling probability distribution derived from the underlying data distribution, and adaptively shifts the decision boundary toward difficult-to-learn minority and majority class instances via a hypothesis assessment procedure. Simulation analysis on 19 real-world datasets, assessed over various metrics, including overall accuracy, precision, recall, F-measure, G-mean, and receiver operating characteristic (ROC) analysis, illustrates the effectiveness of this method.
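The ranked-oversampling idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact RAMOBoost procedure: the hypothetical helper `ranked_minority_oversample` weights each minority instance by how many majority points fall among its k nearest neighbors (a simple proxy for "difficult to learn"), normalizes those weights into a sampling probability distribution, and then synthesizes new points SMOTE-style by interpolating between a ranked seed and one of its minority neighbors.

```python
import numpy as np

def ranked_minority_oversample(X_min, X_maj, n_synth, k=5, rng=None):
    """Sketch of ranked minority oversampling (hypothetical helper).

    Minority instances surrounded by more majority neighbors get a
    higher sampling probability, so synthesis concentrates near the
    hard-to-learn region of the decision boundary.
    """
    rng = np.random.default_rng(rng)
    X_all = np.vstack([X_min, X_maj])
    n_min = len(X_min)

    # Rank step: count majority points among the k nearest neighbors
    # of each minority instance in the combined data (brute force).
    weights = np.empty(n_min)
    for i, x in enumerate(X_min):
        d = np.linalg.norm(X_all - x, axis=1)
        nn = np.argsort(d)[1:k + 1]        # skip the point itself
        n_maj = np.sum(nn >= n_min)        # indices >= n_min are majority
        weights[i] = n_maj + 1             # smooth so every seed is eligible
    p = weights / weights.sum()            # sampling probability distribution

    # Synthesis step: draw a seed by rank, then interpolate toward a
    # random minority neighbor (SMOTE-style linear interpolation).
    synth = []
    for _ in range(n_synth):
        i = rng.choice(n_min, p=p)
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        j = rng.choice(np.argsort(d)[1:k + 1])
        gap = rng.random()
        synth.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.asarray(synth)
```

In the full method this generation step would run inside each boosting iteration, with the ensemble's hypothesis assessment further reweighting instances; the sketch above isolates only the ranking-and-synthesis core.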