C4.5: Programs for Machine Learning. Machine Learning.
A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, special issue: 26th annual ACM symposium on the theory of computing (STOC'94), May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT'95), March 13–15, 1995.
The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence.
MultiBoosting: A Technique for Combining Boosting and Wagging. Machine Learning.
Generalized feature extraction for structural pattern recognition in time-series data.
Rotation Forest: A New Classifier Ensemble Method. IEEE Transactions on Pattern Analysis and Machine Intelligence.
An introduction to ROC analysis. Pattern Recognition Letters, special issue: ROC analysis in pattern recognition.
Statistical Comparisons of Classifiers over Multiple Data Sets. The Journal of Machine Learning Research.
Automatically countering imbalance and its empirical relationship to cost. Data Mining and Knowledge Discovery.
Learning Decision Trees for Unbalanced Data. ECML PKDD '08: Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases, Part I.
IEEE Transactions on Knowledge and Data Engineering.
SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research.
The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter.
Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics.
An experimental study on rotation forest ensembles. MCS'07: Proceedings of the 7th International Conference on Multiple Classifier Systems.
Generating diverse ensembles to counter the problem of class imbalance. PAKDD'10: Proceedings of the 14th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, Part II.
RUSBoost: A Hybrid Approach to Alleviating Class Imbalance. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans.
Learning ensemble classifiers via restricted Boltzmann machines. Pattern Recognition Letters.
Ensembles of decision trees are considered for imbalanced datasets. Conventional decision trees (C4.5) and trees designed for imbalanced data (CCPDT: Class Confidence Proportion Decision Tree) are used as base classifiers. Ensemble methods designed for imbalanced data, based on undersampling and oversampling, are considered, together with conventional ensemble methods that are not specific to imbalanced data: Bagging, Random Subspaces, AdaBoost, Real AdaBoost, MultiBoost and Rotation Forest. The results show that the choice of ensemble method is much more important than the type of decision tree used as the base classifier. Rotation Forest is the ensemble method with the best results, and CCPDT shows no advantage over C4.5.
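To illustrate the undersampling-based ensemble idea mentioned above, the following is a minimal sketch of "underbagging": each tree in the ensemble is trained on a balanced subset obtained by randomly undersampling the majority class, and predictions are combined by majority vote. This is an illustration of the general technique, not the exact procedure evaluated in the study; it assumes scikit-learn is available for the base decision trees, and the function names (`fit_underbagging`, `predict_underbagging`) are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # assumed available

def fit_underbagging(X, y, n_estimators=10, random_state=0):
    """Train an undersampling-based bagging ensemble (sketch).

    Each tree sees all minority examples (label 1) plus an equally
    sized random undersample of the majority class (label 0).
    """
    rng = np.random.RandomState(random_state)
    minority = X[y == 1]
    majority = X[y == 0]
    trees = []
    for _ in range(n_estimators):
        # Balanced subset: all minority + random majority sample of same size.
        idx = rng.choice(len(majority), size=len(minority), replace=False)
        Xb = np.vstack([minority, majority[idx]])
        yb = np.concatenate([np.ones(len(minority)), np.zeros(len(minority))])
        tree = DecisionTreeClassifier(random_state=rng.randint(1 << 30))
        trees.append(tree.fit(Xb, yb))
    return trees

def predict_underbagging(trees, X):
    """Majority vote over the ensemble's per-tree predictions."""
    votes = np.mean([t.predict(X) for t in trees], axis=0)
    return (votes >= 0.5).astype(int)
```

On a strongly imbalanced dataset, a single tree trained on all the data tends to favour the majority class, whereas each underbagged tree sees a balanced sample, so the vote is less biased toward the majority.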