Improving the performance of minor class in decision tree using duplicating instances

  • Authors: Hyontai Sug

  • Affiliations: Division of Computer and Information Engineering, Dongseo University, Busan, Republic of Korea

  • Venue: AIKED'11 Proceedings of the 10th WSEAS international conference on Artificial intelligence, knowledge engineering and data bases
  • Year: 2011

Abstract

Because decision trees are built to cover all training instances with minimal error, instances that belong to minor classes carry less weight during tree construction. As a result, classification accuracy for minor classes is usually poorer than for major classes, even though good accuracy on minor classes is also desirable. This paper proposes over-sampling the minor classes by duplicating their instances, so that the generated trees are more accurate for those classes, and using decision trees built with conventional sampling together with decision trees built with over-sampling for better classification. Experiments with a representative decision tree algorithm, C4.5, show very promising results.
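
The idea of over-sampling by duplication can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `oversample_by_duplication`, the toy data, and the choice to duplicate until every class matches the size of the largest class are all assumptions made here for illustration, since the abstract does not state a duplication rate.

```python
from collections import Counter
import random


def oversample_by_duplication(rows, labels, seed=0):
    """Return copies of rows/labels with minor-class instances duplicated.

    Assumption (not from the paper): each class is padded with randomly
    chosen duplicates of its own instances until it is as large as the
    largest class.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        members = [r for r, y in zip(rows, labels) if y == cls]
        # Append (target - n) duplicates; the largest class gets none.
        for _ in range(target - n):
            out_rows.append(rng.choice(members))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data: four "major" instances and one "minor" instance.
rows = [[1.0], [1.2], [0.9], [1.1], [5.0]]
labels = ["major", "major", "major", "major", "minor"]

new_rows, new_labels = oversample_by_duplication(rows, labels)
balanced_counts = Counter(new_labels)
```

The over-sampled set would then be passed to a decision tree learner such as C4.5 in place of the original training set, while a second tree is trained on the unmodified data. How the two trees are combined at prediction time is not described in the abstract, so it is left out of this sketch.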