A novel Chi2 algorithm for discretization of continuous attributes

  • Authors:
  • Wenyu Qu;Deqian Yan;Yu Sang;Hongxia Liang;Masaru Kitsuregawa;Keqiu Li

  • Affiliations:
  • School of Computer Science and Technology, Dalian Maritime University, Dalian, China and Institute of Industrial Science, The University of Tokyo, Tokyo, Japan;Department of Computer Science and Engineering, Liaoning Normal University, Dalian, China;Department of Computer Science and Engineering, Liaoning Normal University, Dalian, China;Department of Computer Science and Engineering, Liaoning Normal University, Dalian, China;Institute of Industrial Science, The University of Tokyo, Tokyo, Japan;Department of Computer Science and Engineering, Dalian University of Technology, Dalian, China

  • Venue:
  • APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
  • Year:
  • 2008

Abstract

The Chi2 algorithm, the Modified Chi2 algorithm, and the Extended Chi2 algorithm are well-known discretization algorithms grounded in probability and statistics. After studying these algorithms and analyzing their drawbacks, we present a new Chi2 algorithm, called the Rectified Chi2 algorithm, which adopts a new merging standard as the basis of interval merging and discretizes real-valued attributes exactly and reasonably. We also present a new sequence method (DSM) to overcome a drawback of the Modified Chi2 and Extended Chi2 algorithms, namely that they adopt only the maximal difference as the standard for interval merging. We evaluate the performance of the Rectified Chi2 algorithm and DSM through extensive experiments. The experimental results show the effectiveness of the proposed algorithms.
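
For readers unfamiliar with the Chi2 family, the following minimal sketch illustrates the classic chi-square statistic that such algorithms compute for a pair of adjacent intervals before deciding whether to merge them. It is not the Rectified Chi2 merging standard or DSM, which the abstract does not specify. The function names `chi_square` and `most_similar_pair`, the input layout (one list of per-class counts per interval), and the handling of empty cells are assumptions made for illustration only.

```python
from typing import List, Sequence


def chi_square(row_a: Sequence[int], row_b: Sequence[int]) -> float:
    """Pearson chi-square statistic for two adjacent intervals.

    row_a[j] and row_b[j] hold the number of samples of class j
    that fall into each interval (hypothetical input layout).
    """
    if len(row_a) != len(row_b):
        raise ValueError("both intervals need counts for the same classes")
    total = sum(row_a) + sum(row_b)
    if total == 0:
        return 0.0
    stat = 0.0
    for row in (row_a, row_b):
        row_total = sum(row)
        for j, observed in enumerate(row):
            col_total = row_a[j] + row_b[j]
            expected = row_total * col_total / total
            if expected == 0:
                # Assumption: cells with zero expected count are skipped;
                # published variants handle this case differently.
                continue
            stat += (observed - expected) ** 2 / expected
    return stat


def most_similar_pair(intervals: List[Sequence[int]]) -> int:
    """Index i of the adjacent pair (i, i + 1) with the smallest statistic."""
    if len(intervals) < 2:
        raise ValueError("need at least two intervals")
    scores = [chi_square(intervals[i], intervals[i + 1])
              for i in range(len(intervals) - 1)]
    return scores.index(min(scores))
```

A low statistic suggests that the class distributions of the two intervals are similar, so the pair is a merge candidate in this classic scheme. The abstract's criticism of relying on a single merging criterion applies to the later variants, whose "maximal difference" standard is not reproduced here.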