Efficient decision tree re-alignment for clustering time-changing data streams

  • Authors: Yingying Tao; M. Tamer Özsu

  • Affiliations: University of Waterloo, Waterloo, Ontario, Canada; University of Waterloo, Waterloo, Ontario, Canada

  • Venue: From active data management to event-based systems and more
  • Year: 2010

Abstract

Mining streaming data is an active research area driven by the requirements of applications such as financial marketing, telecommunications, and network monitoring. Decision trees are a popular technique for mining these continuous, fast-arriving data streams. However, the accuracy of a decision tree can deteriorate if the distribution of values in the stream changes over time. In this paper, we propose a decision-tree-based approach that detects distribution changes and quickly re-aligns the tree to reflect them. The technique exploits a set of synopses on the leaf nodes, which are also used to prune the decision tree. Experimental results demonstrate that the proposed approach detects distribution changes in real time with high accuracy, and that re-aligning a decision tree improves its performance in clustering subsequent data stream tuples.
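
The abstract does not spell out how the leaf synopses are built or compared, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes the synopsis is a plain per-leaf tuple count, and that a change is flagged when the total variation distance between a reference window and a recent window exceeds a threshold. The function names, the distance measure, and the default threshold of 0.2 are all hypothetical choices made for this example.

```python
from collections import Counter


def leaf_distribution(leaf_ids, leaves):
    """Fraction of tuples in a window that reached each leaf (assumed synopsis)."""
    if not leaf_ids:
        raise ValueError("window must contain at least one tuple")
    counts = Counter(leaf_ids)
    total = len(leaf_ids)
    return [counts[leaf] / total for leaf in leaves]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def change_detected(reference_ids, recent_ids, leaves, threshold=0.2):
    """Flag a distribution change when the two windows' leaf frequencies diverge.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    p = leaf_distribution(reference_ids, leaves)
    q = leaf_distribution(recent_ids, leaves)
    return total_variation(p, q) > threshold


# Example: leaf labels are the leaves each stream tuple was routed to.
leaves = ["a", "b", "c"]
reference = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
stable = ["a"] * 48 + ["b"] * 32 + ["c"] * 20
shifted = ["a"] * 10 + ["b"] * 30 + ["c"] * 60

stable_flag = change_detected(reference, stable, leaves)
shifted_flag = change_detected(reference, shifted, leaves)
```

Under these assumptions, a window whose leaf frequencies stay close to the reference is left alone, while a window in which most tuples move to a different leaf is flagged. A flagged change would be the point at which a tree could be re-aligned; the re-alignment step itself is not sketched here because the abstract gives no details about it.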