In recent years, learning from imbalanced data has attracted growing attention from both academia and industry, driven by the explosive growth of applications that produce and consume imbalanced data. However, because of the complex characteristics of imbalanced data, many real-world solutions struggle to deliver robust performance in learning-based applications. To address this problem, this paper presents Ranked Minority Oversampling in Boosting (RAMOBoost), an adaptive synthetic data generation technique embedded in an ensemble learning system. Briefly, at each learning iteration RAMOBoost ranks minority class instances according to a sampling probability distribution derived from the underlying data distribution, and adaptively shifts the decision boundary toward difficult-to-learn minority and majority class instances via a hypothesis assessment procedure. Simulation analysis on 19 real-world datasets, assessed over various metrics, including overall accuracy, precision, recall, F-measure, G-mean, and receiver operating characteristic (ROC) analysis, illustrates the effectiveness of this method.
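The ranked-oversampling idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact RAMOBoost procedure: the hypothetical helper `ranked_minority_oversample` weights each minority instance by how many majority points fall among its k nearest neighbors (a simple proxy for "difficult to learn"), normalizes those weights into a sampling probability distribution, and then synthesizes new points SMOTE-style by interpolating between a ranked seed and one of its minority neighbors.

```python
import numpy as np

def ranked_minority_oversample(X_min, X_maj, n_synth, k=5, rng=None):
    """Sketch of ranked minority oversampling (hypothetical helper).

    Minority instances surrounded by more majority neighbors get a
    higher sampling probability, so synthesis concentrates near the
    hard-to-learn region of the decision boundary.
    """
    rng = np.random.default_rng(rng)
    X_all = np.vstack([X_min, X_maj])
    n_min = len(X_min)

    # Rank step: count majority points among the k nearest neighbors
    # of each minority instance in the combined data (brute force).
    weights = np.empty(n_min)
    for i, x in enumerate(X_min):
        d = np.linalg.norm(X_all - x, axis=1)
        nn = np.argsort(d)[1:k + 1]        # skip the point itself
        n_maj = np.sum(nn >= n_min)        # indices >= n_min are majority
        weights[i] = n_maj + 1             # smooth so every seed is eligible
    p = weights / weights.sum()            # sampling probability distribution

    # Synthesis step: draw a seed by rank, then interpolate toward a
    # random minority neighbor (SMOTE-style linear interpolation).
    synth = []
    for _ in range(n_synth):
        i = rng.choice(n_min, p=p)
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        j = rng.choice(np.argsort(d)[1:k + 1])
        gap = rng.random()
        synth.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.asarray(synth)
```

In the full method this generation step would run inside each boosting iteration, with the ensemble's hypothesis assessment further reweighting instances; the sketch above isolates only the ranking-and-synthesis core.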