An Efficient and Sensitive Decision Tree Approach to Mining Concept-Drifting Data Streams

  • Authors:
  • Cheng-Jung Tsai;Chien-I Lee;Wei-Pang Yang

  • Affiliations:
  • Department of Computer Science, National Chiao Tung University, 1001, Ta Hsueh Rd., Hsinchu 300, Taiwan, Republic of China, e-mail: tsaicj@cis.nctu.edu.tw;Department of Information and Learning Technology, National University of Tainan, 33, Sec. 2, Shu-Lin St. Tainan 700, Taiwan, Republic of China, e-mail: leeci@mail.nutn.edu.tw;Department of Information Management, National Dong Hwa University, No.1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien 97401, Taiwan, Republic of China, e-mail: wpyang@mail.ndhu.edu.tw

  • Venue:
  • Informatica
  • Year:
  • 2008

Abstract

Data stream mining has become a research topic of growing interest in knowledge discovery. Most proposed algorithms for data stream mining assume that each data block is essentially a random sample from a stationary distribution, but many available databases violate this assumption. That is, the class of an instance may change over time, a phenomenon known as concept drift. In this paper, we propose a Sensitive Concept Drift Probing Decision Tree algorithm (SCRIPT), which is based on the statistical χ² test, to handle the concept drift problem on data streams. Compared with previously proposed methods, the advantages of SCRIPT include: a) it avoids unnecessary system cost for stable data streams; b) it immediately and efficiently corrects the original classifier while data streams are unstable; c) it is more suitable for applications in which sensitive detection of concept drift is required.
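The abstract does not give the details of SCRIPT, so the following is only a minimal sketch of the general idea of using a χ² test to probe for drift, not the authors' algorithm. It assumes drift is checked by comparing the class-label distributions of two consecutive data blocks with a Pearson χ² test of homogeneity. The function names `chi_square_statistic` and `drift_detected`, and the choice to pass the critical value in by hand, are illustrative assumptions.

```python
from collections import Counter


def chi_square_statistic(block_a, block_b):
    """Pearson chi-square statistic for homogeneity of two label samples.

    block_a, block_b: non-empty sequences of class labels (hypothetical
    stand-ins for two consecutive data blocks of a stream).
    Returns (statistic, degrees_of_freedom).
    """
    if not block_a or not block_b:
        raise ValueError("both blocks must be non-empty")
    counts_a, counts_b = Counter(block_a), Counter(block_b)
    categories = sorted(set(counts_a) | set(counts_b))
    n_a, n_b = len(block_a), len(block_b)
    total = n_a + n_b
    stat = 0.0
    for c in categories:
        column_total = counts_a[c] + counts_b[c]
        for observed, n in ((counts_a[c], n_a), (counts_b[c], n_b)):
            # Expected count if both blocks shared one label distribution.
            expected = n * column_total / total
            stat += (observed - expected) ** 2 / expected
    # Two rows (blocks), so dof = (2 - 1) * (number of categories - 1).
    return stat, len(categories) - 1


def drift_detected(block_a, block_b, critical_value):
    """Flag drift when the statistic exceeds the supplied critical value."""
    stat, _ = chi_square_statistic(block_a, block_b)
    return stat > critical_value


# Illustrative data only: two-class blocks, so 1 degree of freedom.
# 3.841 is the chi-square critical value for 1 dof at the 0.05 level.
old_block = ["yes"] * 80 + ["no"] * 20
similar_block = ["yes"] * 78 + ["no"] * 22
shifted_block = ["yes"] * 40 + ["no"] * 60

stable_result = drift_detected(old_block, similar_block, 3.841)
drift_result = drift_detected(old_block, shifted_block, 3.841)
```

In this sketch, a stable stream produces a small statistic and triggers no further work, whereas a shifted block exceeds the threshold and would be the signal to correct the classifier. A lower significance level (a larger critical value) makes the probe less sensitive.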