Classification with imbalanced data-sets has become one of the most challenging problems in Data Mining. When one class is much more heavily represented than the other, both the learning and classification processes suffer undesirable effects, mainly with regard to the minority class. Addressing this problem requires accurate tools; lately, ensembles of classifiers have emerged as a possible solution. Among ensemble proposals, the combination of Bagging and Boosting with preprocessing techniques has proved its ability to enhance the classification of the minority class. In this paper, we develop a new ensemble construction algorithm (EUSBoost) based on RUSBoost, one of the simplest and most accurate ensembles, which combines random undersampling with the Boosting algorithm. Our methodology aims to improve on existing proposals by enhancing the performance of the base classifiers through the evolutionary undersampling approach. In addition, we promote diversity by favoring the use of different subsets of majority class instances to train each base classifier. Focusing on two-class highly imbalanced problems, we show, supported by the proper statistical analysis, that EUSBoost is able to outperform state-of-the-art ensemble-based methods. We also analyze its advantages using kappa-error diagrams, which we adapt to the imbalanced scenario.
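As a rough illustration of the random undersampling idea that RUSBoost builds on, the following minimal sketch draws a different balanced subset of majority class instances for each round. It is not the EUSBoost algorithm: the evolutionary search and the Boosting weight updates are omitted, and the names `random_undersample` and `subsets_per_round`, the toy labels, and the 1:1 balancing ratio are all assumptions made for this example.

```python
import random


def random_undersample(labels, minority_label, rng):
    """Return sorted indices of a balanced subset (assumed 1:1 ratio):
    every minority instance plus an equally sized random draw,
    without replacement, from the majority class."""
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    k = min(len(minority), len(majority))
    return sorted(minority + rng.sample(majority, k))


def subsets_per_round(labels, minority_label, n_rounds, seed=0):
    """Draw one independent balanced subset per round, so each
    (hypothetical) base classifier would see different majority
    instances."""
    rng = random.Random(seed)
    return [random_undersample(labels, minority_label, rng)
            for _ in range(n_rounds)]


# Toy two-class problem: 5 minority (label 1) and 45 majority (label 0).
labels = [1] * 5 + [0] * 45
rounds = subsets_per_round(labels, minority_label=1, n_rounds=3)
for t, subset in enumerate(rounds):
    print(f"round {t}: {len(subset)} training indices")
```

In a full Boosting loop, each subset would be used to fit one base classifier before the instance weights are updated; replacing the random draw with a guided search over majority subsets is where an evolutionary undersampling step would plug in.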