Using model trees and their ensembles for imbalanced data

  • Authors:
  • Juan J. Rodríguez; José F. Díez-Pastor; César García-Osorio; Pedro Santos

  • Affiliations:
  • University of Burgos, Spain; University of Burgos, Spain; University of Burgos, Spain; University of Burgos, Spain

  • Venue:
  • CAEPIA'11 Proceedings of the 14th international conference on Advances in artificial intelligence: Spanish association for artificial intelligence
  • Year:
  • 2011

Abstract

Model trees are decision trees with linear regression functions at the leaves. Although originally proposed for regression, they have also been applied successfully to classification problems. This paper studies their performance on imbalanced problems. These trees give better results than standard decision trees (J48, based on C4.5) and than decision trees designed specifically for imbalanced data (CCPDT: Class Confidence Proportion Decision Trees). Moreover, several ensemble methods are considered using these trees as base classifiers: Bagging, Random Subspaces, AdaBoost, MultiBoost and LogitBoost, together with methods specific to imbalanced data: Random Undersampling and SMOTE. Ensembles of model trees also give better results than ensembles of the other trees considered.
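
To make the definition in the abstract concrete, the following is a minimal sketch of a model tree on a single numeric feature: internal nodes split on a threshold, and each leaf holds a least-squares line instead of a constant. It is not the algorithm evaluated in the paper. The split criterion (sum of squared errors of the two child lines), the function names (fit_line, sse, build, predict, classify), the depth and min_leaf parameters, the 0.5 cutoff and the toy data are all assumptions chosen for illustration. Classification is handled here by regressing on a 0/1 class indicator and thresholding the prediction, which is one common way to use regression trees as classifiers.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line y = a + b*x; constant model if x has no spread."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return float(my), 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def sse(xs, ys, model):
    """Sum of squared errors of a line (a, b) on the given points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def build(xs, ys, depth=2, min_leaf=3):
    """Return a nested dict: internal nodes split on x, leaves hold a line."""
    model = fit_line(xs, ys)
    best = None
    if depth > 0:
        pairs = sorted(zip(xs, ys))
        for i in range(min_leaf, len(pairs) - min_leaf + 1):
            if pairs[i - 1][0] == pairs[i][0]:
                continue  # cannot split between equal x values
            lx, ly = zip(*pairs[:i])
            rx, ry = zip(*pairs[i:])
            err = sse(lx, ly, fit_line(lx, ly)) + sse(rx, ry, fit_line(rx, ry))
            if best is None or err < best[0]:
                thr = (pairs[i - 1][0] + pairs[i][0]) / 2
                best = (err, thr, lx, ly, rx, ry)
    # Keep a single leaf when no split lowers the error of one line.
    if best is None or best[0] >= sse(xs, ys, model) - 1e-12:
        return {"leaf": model}
    _, thr, lx, ly, rx, ry = best
    return {
        "thr": thr,
        "lo": build(lx, ly, depth - 1, min_leaf),
        "hi": build(rx, ry, depth - 1, min_leaf),
    }


def predict(node, x):
    """Walk down to a leaf and evaluate its linear model."""
    while "leaf" not in node:
        node = node["lo"] if x <= node["thr"] else node["hi"]
    a, b = node["leaf"]
    return a + b * x


def classify(node, x, cutoff=0.5):
    """Binary decision from a tree fitted to a 0/1 class indicator."""
    return 1 if predict(node, x) >= cutoff else 0


# Toy imbalanced problem (hypothetical data): 17 negatives, 3 positives.
xs = list(range(20))
ys = [1 if x >= 17 else 0 for x in xs]
tree = build(xs, ys)
print(classify(tree, 18), classify(tree, 5))
```

A full implementation would handle many features, prune the tree and smooth the leaf predictions; this sketch only shows the core idea of linear models at the leaves.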