Selective costing ensemble for handling imbalanced data sets

  • Authors:
  • S. B. Kotsiantis;P. E. Pintelas

  • Affiliations:
  • (Correspd. Tel.: +30 2610 997833/ +30 2610 997313, Fax: +30 2610 997313/ E-mail: sotos@math.upatras.gr) Educational Software Development Laboratory, Department of Mathematics, University of Patras ...;Educational Software Development Laboratory, Department of Mathematics, University of Patras, Greece

  • Venue:
  • International Journal of Hybrid Intelligent Systems
  • Year:
  • 2009

Abstract

Many real-world problems exhibit skewed class distributions, in which almost all cases belong to one class and far fewer cases belong to a smaller, usually more interesting class. A learner induced from an imbalanced data set typically has a low error rate for the majority class and an undesirably high error rate for the minority class. This paper first provides an organized study of the various methodologies that have tried to handle this problem. It then presents an experimental study comparing these methodologies with a proposed selective costing ensemble, and it concludes that such a framework can be a more effective solution to the problem. Our method seems to allow improved identification of the difficult small class in predictive analysis, while keeping the classification ability for the majority class at an acceptable level.
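
As a small illustration of the imbalance problem described in the abstract (not the authors' method or data), the following Python sketch computes per-class error rates for a degenerate learner that always predicts the majority class. The 95/5 class split, the labels 0 and 1, and the helper name `per_class_error` are assumptions made only for this example.

```python
from collections import Counter

def per_class_error(y_true, y_pred):
    """Return {class_label: fraction of that class's cases misclassified}."""
    totals = Counter(y_true)
    wrong = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {label: wrong[label] / totals[label] for label in totals}

# Hypothetical data: 95 majority cases (label 0), 5 minority cases (label 1).
y_true = [0] * 95 + [1] * 5
# A degenerate learner that always predicts the majority class.
y_pred = [0] * 100

overall_error = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
errors = per_class_error(y_true, y_pred)
print(overall_error)  # 0.05
print(errors)         # {0: 0.0, 1: 1.0}
```

The overall error looks low (5%), yet every minority case is misclassified, which is why overall accuracy alone can be misleading on skewed data.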