Rough Sets for Handling Imbalanced Data: Combining Filtering and Rule-based Classifiers

  • Authors:
  • Jerzy Stefanowski; Szymon Wilk

  • Affiliations:
  • Institute of Computing Sciences, Poznań University of Technology, ul. Piotrowo 2, 60-965 Poznań, Poland. E-mail: {Jerzy.Stefanowski,Szymon.Wilk}@cs.put.poznan.pl

  • Venue:
  • Fundamenta Informaticae, Special Issue on Concurrency Specification and Programming (CS&P 2005), Ruciane-Nida, Poland, 28-30 September 2005
  • Year:
  • 2006

Abstract

The paper addresses the problem of improving the performance of rule-based classifiers constructed from imbalanced data sets, i.e., data sets where the minority class, which is of primary importance, is under-represented in comparison to the majority classes. We introduce two filtering techniques that detect and process inconsistent examples from the majority classes lying in the boundary between the minority and majority classes. The techniques differ in how they process these inconsistent boundary examples: the first removes them, while the second relabels them as belonging to the minority class. The experiments showed that the best results were obtained with the relabeling variant, in which inconsistent majority class examples are reassigned to the minority class, combined with a classifier composed of decision rules generated by the MODLEM algorithm.
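
The two filtering variants described above can be illustrated with a minimal Python sketch. The abstract does not say how inconsistent examples are detected, so the sketch assumes the classical rough set notion of inconsistency: a majority class example is treated as a boundary example when it has exactly the same attribute values as at least one minority class example. The function name `filter_boundary`, the `mode` parameter, and the representation of examples as `(attributes, label)` pairs are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def filter_boundary(examples, minority, mode="remove"):
    """Filter inconsistent majority examples (illustrative sketch only).

    examples: list of (attributes, label) pairs, attributes being a hashable tuple.
    minority: label of the minority class.
    mode:     "remove" drops inconsistent majority examples,
              "relabel" reassigns them to the minority class.
    """
    if mode not in ("remove", "relabel"):
        raise ValueError("mode must be 'remove' or 'relabel'")

    # Assumption: examples with identical attribute values are indiscernible.
    labels_by_description = defaultdict(set)
    for attributes, label in examples:
        labels_by_description[attributes].add(label)

    filtered = []
    for attributes, label in examples:
        # A majority example is in the boundary if it shares its description
        # with at least one minority example.
        inconsistent_majority = (
            label != minority and minority in labels_by_description[attributes]
        )
        if not inconsistent_majority:
            filtered.append((attributes, label))
        elif mode == "relabel":
            filtered.append((attributes, minority))
        # mode == "remove": the example is simply dropped.
    return filtered


data = [
    (("a", 1), "min"),
    (("a", 1), "maj"),  # indiscernible from the minority example above
    (("b", 2), "maj"),
    (("c", 3), "min"),
]
removed = filter_boundary(data, "min", mode="remove")
relabeled = filter_boundary(data, "min", mode="relabel")
```

In this sketch the filtered data would then be passed to a rule induction algorithm. The abstract names MODLEM for that step, which is not implemented here.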