The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study

  • Authors: Xu-Ying Liu; Zhi-Hua Zhou

  • Affiliations: Nanjing University, China; Nanjing University, China

  • Venue: ICDM '06: Proceedings of the Sixth International Conference on Data Mining
  • Year: 2006

Abstract

In real-world applications, the number of examples in one class may overwhelm that of the other class, yet the primary interest is usually in the minority class. Cost-sensitive learning has been deemed a good solution to such class-imbalanced tasks, but it is not clear how class imbalance affects cost-sensitive classifiers. This paper presents an empirical study on 38 data sets, which shows that class imbalance often affects the performance of cost-sensitive classifiers. When the misclassification costs are not seriously unequal, cost-sensitive classifiers generally favor the natural class distribution, even though it may be imbalanced; when the misclassification costs are seriously unequal, a balanced class distribution is more favorable.
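
For readers unfamiliar with the term, the following is a minimal sketch of what a cost-sensitive decision means for a two-class problem. It is not the method evaluated in the paper. It only illustrates the standard minimum-expected-cost rule, assuming correct predictions cost nothing and the classifier outputs a calibrated probability for the minority (positive) class. The function names and the example costs are hypothetical.

```python
def min_cost_threshold(cost_fp: float, cost_fn: float) -> float:
    """Probability threshold at or above which predicting the positive
    (minority) class has the lower expected cost.

    Assumption: correct predictions cost 0, so predicting positive is
    cheaper when p * cost_fn >= (1 - p) * cost_fp.
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("misclassification costs must be positive")
    return cost_fp / (cost_fp + cost_fn)


def predict(p_positive: float, cost_fp: float, cost_fn: float) -> int:
    """Return 1 (minority class) or 0 (majority class) by minimum expected cost."""
    return int(p_positive >= min_cost_threshold(cost_fp, cost_fn))


if __name__ == "__main__":
    # Equal costs: the usual 0.5 threshold applies.
    print(min_cost_threshold(1.0, 1.0))
    # Missing a minority example is 9x as costly: the threshold drops,
    # so a fairly unlikely positive is still predicted as positive.
    print(min_cost_threshold(1.0, 9.0))
    print(predict(0.2, cost_fp=1.0, cost_fn=9.0))
```

As the cost ratio grows more unequal, the threshold moves further from 0.5. This is the regime in which the abstract reports that a balanced class distribution becomes more favorable.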