RWS (random walk splitting): a random walk based discretization of continuous attributes

  • Authors:
  • Masaaki Hanaoka;Masaki Kobayashi;Haruaki Yamazaki

  • Affiliations:
  • Faculty of Engineering, Yamanashi University, Yamanashi, Japan (all authors)

  • Venue:
  • PRICAI'00: Proceedings of the 6th Pacific Rim International Conference on Artificial Intelligence
  • Year:
  • 2000

Abstract

The discretization of continuous attributes in a training set is an important issue that significantly affects the performance of decision trees. This paper proposes a method that discretizes continuous attributes using a statistical test modeled on a random walk. The algorithm searches for the point that divides the training set T into two groups T1 and T2, with T = T1 ∪ T2, such that T1 contains as many instances of the majority class as possible. In other words, it detects the splitting point that gives the maximum discrepancy between two empirical distributions: that of the majority class and that of the remaining classes. The algorithm applies this procedure recursively until a statistical criterion is satisfied. We also report the effectiveness of the algorithm relative to ChiMerge and MDLPC, based on an experiment with data from the UCI repository.
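
The single splitting step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact test statistic or the stopping criterion, so this sketch makes assumptions: it measures "discrepancy" as the absolute difference between the empirical cumulative distributions of the majority class and of the remaining instances (a Kolmogorov-Smirnov-style quantity), it places the cut midway between adjacent distinct values, and it omits the recursion and the statistical stopping test. The function name `best_split` is hypothetical.

```python
from collections import Counter


def best_split(values, labels):
    """Return (cut_point, discrepancy) for one continuous attribute.

    Assumption: discrepancy is the absolute gap between the empirical
    CDFs of the majority class and of all other instances. This is an
    illustrative stand-in, not the statistic defined in the paper.
    """
    majority = Counter(labels).most_common(1)[0][0]
    n_major = sum(1 for y in labels if y == majority)
    n_rest = len(labels) - n_major
    if n_major == 0 or n_rest == 0:
        return None, 0.0  # only one group present, nothing to separate

    # Walk through the instances in increasing attribute order.
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    seen_major = seen_rest = 0
    best_cut, best_gap = None, 0.0
    for i, (x, y) in enumerate(pairs[:-1]):
        if y == majority:
            seen_major += 1
        else:
            seen_rest += 1
        next_x = pairs[i + 1][0]
        if next_x == x:
            continue  # cannot cut between two equal attribute values
        gap = abs(seen_major / n_major - seen_rest / n_rest)
        if gap > best_gap:
            best_gap = gap
            best_cut = (x + next_x) / 2  # assumed midpoint convention
    return best_cut, best_gap


values = [1, 2, 3, 4, 10, 11, 12]
labels = ["a", "a", "a", "a", "b", "b", "b"]
cut, gap = best_split(values, labels)
```

In this toy data set the majority class "a" occupies the low values, so the largest gap occurs between 4 and 10. A recursive discretizer would call the same function on each resulting subset until its chosen statistical criterion rejects further splits.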