Random ensemble decision trees for learning concept-drifting data streams

  • Authors:
  • Peipei Li; Xindong Wu; Qianhui Liang; Xuegang Hu; Yuhong Zhang

  • Affiliations:
  • School of Computer Science and Information Engineering, Hefei University of Technology, China; School of Computer Science and Information Engineering, Hefei University of Technology and Department of Computer Science, University of Vermont, Vermont; Hewlett-Packard Labs Singapore; School of Computer Science and Information Engineering, Hefei University of Technology, China; School of Computer Science and Information Engineering, Hefei University of Technology, China

  • Venue:
  • PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
  • Year:
  • 2011

Abstract

Few online classification algorithms based on traditional inductive ensembling focus on handling concept-drifting data streams while also performing well on noisy data. Motivated by this, this paper proposes an incremental algorithm based on random Ensemble Decision Trees for Concept-drifting data streams (EDTC). Three variants of random feature selection are developed to implement split-tests. To better track concept drifts in data streams with noisy data, an improved two-threshold-based drift detection mechanism is introduced. Extensive studies demonstrate that the algorithm performs very well compared to several known online algorithms based on single models and ensemble models. We conclude that EDTC offers multiple solutions for learning from concept-drifting data streams with noise.
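
The abstract does not describe how the two-threshold drift detection works, so the following is only a generic illustration of the idea and not the EDTC mechanism. It sketches a detector in the style of the well-known Drift Detection Method (DDM): a lower "warning" level and a higher "drift" level are placed on the running error rate of a classifier. The class name, the factors 2.0 and 3.0, and the `min_samples` warm-up are assumptions chosen for the example.

```python
import math


class TwoThresholdDriftDetector:
    """Generic DDM-style detector (an assumption, not the paper's method)."""

    def __init__(self, warning_factor=2.0, drift_factor=3.0, min_samples=30):
        self.warning_factor = warning_factor  # lower threshold multiplier
        self.drift_factor = drift_factor      # higher threshold multiplier
        self.min_samples = min_samples        # warm-up before any decision
        self.reset()

    def reset(self):
        """Forget all statistics, e.g. after a drift has been signalled."""
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, is_error):
        """Consume one prediction outcome; return 'stable', 'warning' or 'drift'."""
        self.n += 1
        self.errors += int(is_error)
        p = self.errors / self.n                # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)   # its standard deviation
        if self.n < self.min_samples:
            return "stable"
        # Remember the best (lowest) error level seen so far.
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        level = p + s
        if level > self.p_min + self.drift_factor * self.s_min:
            self.reset()
            return "drift"
        if level > self.p_min + self.warning_factor * self.s_min:
            return "warning"
        return "stable"


# Hypothetical stream: 10% errors for 200 examples, then every prediction is wrong.
detector = TwoThresholdDriftDetector()
before = [detector.update(i % 10 == 0) for i in range(1, 201)]
after = [detector.update(True) for _ in range(50)]
```

In a sketch like this, the gap between the two thresholds is what gives some tolerance to noise: a brief burst of errors may raise a warning without triggering a full model reset, while a sustained rise in the error rate crosses the second threshold and is treated as a drift.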