An empirical study of the behavior of classifiers on imbalanced and overlapped data sets

  • Authors:
  • Vicente García; Jose Sánchez; Ramon Mollineda

  • Affiliations:
  • Lab. Reconocimiento de Patrones, Instituto Tecnológico de Toluca, Metepec, México and Dept. Llenguatges i Sistemes Informátics, Universitat Jaume I, Castelló de la Plana, Spain; Dept. Llenguatges i Sistemes Informátics, Universitat Jaume I, Castelló de la Plana, Spain; Dept. Llenguatges i Sistemes Informátics, Universitat Jaume I, Castelló de la Plana, Spain

  • Venue:
  • CIARP'07: Proceedings of the 12th Iberoamerican Congress on Pattern Recognition: Progress in Pattern Recognition, Image Analysis and Applications
  • Year:
  • 2007

Abstract

Class imbalance has been reported as an important obstacle to applying traditional learning algorithms to real-world domains. Recent investigations have questioned whether imbalance is the only factor that hinders classifier performance. In this paper, we study the behavior of six algorithms when classifying imbalanced, overlapped data sets under uncommon situations (e.g., when the overall imbalance ratio differs from the local imbalance ratio in the overlap region). This is accomplished by analyzing the accuracy on each individual class, thus revealing how those situations affect the majority and minority classes. The experiments corroborate that overlap matters more than imbalance for classification performance. They also show that the classifiers behave differently depending on the nature of each model.
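
The two quantities the abstract relies on, per-class accuracy and the overall versus local imbalance ratio, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the one-dimensional toy data, the overlap interval [40, 60], and the trivial classifier that always predicts the majority class.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples, computed separately per class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def imbalance_ratio(labels, majority, minority):
    """Ratio of majority-class count to minority-class count."""
    counts = Counter(labels)
    if counts[minority] == 0:
        raise ValueError("no minority samples in this set")
    return counts[majority] / counts[minority]


# Hypothetical toy data: (feature value, class label).
samples = [
    (5, "neg"), (10, "neg"), (15, "neg"), (20, "neg"), (25, "neg"),
    (30, "neg"), (35, "neg"), (45, "neg"), (50, "pos"), (55, "pos"),
]
labels = [label for _, label in samples]

# Assumed overlap region: feature values between 40 and 60 inclusive.
overlap_labels = [label for x, label in samples if 40 <= x <= 60]

global_ir = imbalance_ratio(labels, "neg", "pos")         # 8 neg vs 2 pos
local_ir = imbalance_ratio(overlap_labels, "neg", "pos")  # 1 neg vs 2 pos

# A trivial classifier that always predicts the globally larger class.
predictions = ["neg"] * len(labels)
overall_acc = sum(t == p for t, p in zip(labels, predictions)) / len(labels)
acc = per_class_accuracy(labels, predictions)

print(global_ir, local_ir, overall_acc, acc)
```

In this toy set the overall ratio is 4.0 while the ratio inside the assumed overlap region is 0.5, so the globally smaller class is locally the larger one. The overall accuracy of 0.8 hides that the "pos" class is never classified correctly, which is why accuracy is reported per class.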