An Efficient Decision Tree Classification Method Based on Extended Hash Table for Data Streams Mining

  • Authors:
  • Zhenzheng Ouyang; Quanyuan Wu; Tao Wang

  • Venue:
  • FSKD '08 Proceedings of the 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery - Volume 05
  • Year:
  • 2008

Abstract

This paper focuses on handling continuous attributes when mining data streams with concept drift. A data stream is an incremental, online, real-time model. Domingos and Hulten presented a one-pass algorithm; their system, VFDT, uses the Hoeffding inequality to achieve a probabilistic bound on the accuracy of the constructed tree. CVFDT, an extended version of VFDT, handles concept drift efficiently. In this paper, we revisit this problem and implement a system, HashCVFDT, on top of CVFDT. It is as fast as a hash table when inserting, seeking, or deleting attribute values, and it can also sort the attribute values.
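
The Hoeffding inequality mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard form of the bound used in Hoeffding trees, epsilon = sqrt(R^2 ln(1/delta) / (2n)), where R is the range of the split metric, delta is the allowed failure probability, and n is the number of examples seen at a leaf. The function names `hoeffding_bound` and `can_split` are hypothetical and chosen only for illustration; this is not code from VFDT, CVFDT, or HashCVFDT.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability at least 1 - delta, the
    true mean of a variable with the given range lies within epsilon of
    the mean observed over n independent samples."""
    if n <= 0:
        raise ValueError("n must be a positive number of examples")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain: float, second_gain: float,
              value_range: float, delta: float, n: int) -> bool:
    """Illustrative split test (an assumption, not the paper's code):
    split once the observed gap between the two best attributes
    exceeds the Hoeffding bound for the examples seen so far."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # The bound shrinks as more examples arrive at the leaf.
    for n in (100, 1_000, 10_000):
        print(n, round(hoeffding_bound(1.0, 1e-7, n), 4))
```

Because the bound shrinks in proportion to 1/sqrt(n), a leaf that sees more examples can commit to a split on a smaller observed gap between the two best attributes, which is what makes a one-pass learner feasible.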