Using model trees and their ensembles for imbalanced data

Authors:
Juan J. Rodríguez; José F. Díez-Pastor; César García-Osorio; Pedro Santos
Affiliations:
University of Burgos, Spain (all authors)
Venue:
CAEPIA'11 Proceedings of the 14th International Conference on Advances in Artificial Intelligence: Spanish Association for Artificial Intelligence
Year:
2011

References (17):

Bagging predictors. Machine Learning
A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT'95), March 13–15, 1995
The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence
Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation
Using Model Trees for Classification. Machine Learning
MultiBoosting: A Technique for Combining Boosting and Wagging. Machine Learning
Generalized feature extraction for structural pattern recognition in time-series data
Combining Pattern Classifiers: Methods and Algorithms
An introduction to ROC analysis. Pattern Recognition Letters - Special issue: ROC analysis in pattern recognition
Statistical Comparisons of Classifiers over Multiple Data Sets. The Journal of Machine Learning Research
Learning Decision Trees for Unbalanced Data. ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
Measuring classifier performance: a coherent alternative to the area under the ROC curve. Machine Learning
SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research
The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter
Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Generating diverse ensembles to counter the problem of class imbalance. PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
RUSBoost: A Hybrid Approach to Alleviating Class Imbalance. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans

Abstract

Model trees are decision trees with linear regression functions at the leaves. Although originally proposed for regression, they have also been applied successfully to classification problems. This paper studies their performance on imbalanced problems. These trees give better results than standard decision trees (J48, based on C4.5) and than decision trees designed specifically for imbalanced data (CCPDT: Class Confidence Proportion Decision Trees). Moreover, several ensemble methods are considered with these trees as base classifiers: Bagging, Random Subspaces, AdaBoost, MultiBoost, and LogitBoost, as well as two methods specific to imbalanced data, Random Undersampling and SMOTE. Ensembles of model trees also give better results than ensembles of the other trees considered.
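
To make the idea of a model tree concrete, the following is a minimal sketch in plain Python, not the implementation evaluated in the paper. It assumes a single numeric feature and a binary 0/1 class label, regresses the class indicator with a least-squares line in each leaf, and predicts the positive class when the leaf output reaches 0.5. The function names (`fit_line`, `build`, `classify`), the squared-error split criterion, and the toy data are all illustrative assumptions; model tree learners such as M5 use their own split and pruning rules.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line y = a + b*x; a constant if x has no spread."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def sse(xs, ys, line):
    """Sum of squared errors of a fitted line on the given points."""
    a, b = line
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def build(xs, ys, min_leaf=2, max_depth=2):
    """Grow a one-feature model tree; every leaf stores its own line.

    Simplifying assumption: a split is chosen by the total squared
    error of the two leaf lines, which is not the criterion of M5.
    """
    line = fit_line(xs, ys)
    leaf = ("leaf", line)
    if max_depth == 0 or len(xs) < 2 * min_leaf:
        return leaf
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal feature values
        lx, ly = zip(*pairs[:i])
        rx, ry = zip(*pairs[i:])
        err = sse(lx, ly, fit_line(lx, ly)) + sse(rx, ry, fit_line(rx, ry))
        if best is None or err < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (err, threshold, lx, ly, rx, ry)
    # Keep the leaf unless the split clearly reduces the squared error.
    if best is None or best[0] >= sse(xs, ys, line) - 1e-12:
        return leaf
    _, threshold, lx, ly, rx, ry = best
    return ("split", threshold,
            build(lx, ly, min_leaf, max_depth - 1),
            build(rx, ry, min_leaf, max_depth - 1))


def predict(tree, x):
    """Follow the splits to a leaf, then evaluate that leaf's line."""
    while tree[0] == "split":
        _, threshold, left, right = tree
        tree = left if x <= threshold else right
    a, b = tree[1]
    return a + b * x


def classify(tree, x):
    """Binary decision: positive class if the regressed indicator >= 0.5."""
    return 1 if predict(tree, x) >= 0.5 else 0


# Hypothetical imbalanced toy data: 9 majority (0) and 3 minority (1) points.
xs = list(range(12))
ys = [0] * 9 + [1] * 3
tree = build(xs, ys)
print(classify(tree, 3), classify(tree, 10))
```

A standard decision tree would store only a constant in each leaf; here each leaf keeps an intercept and a slope, which is the distinguishing feature of a model tree. Any of the ensemble methods named in the abstract could in principle wrap a base learner of this kind, for example by calling `build` on resampled training sets.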