The CART decision tree for mining data streams

Authors:
Leszek Rutkowski;Maciej Jaworski;Lena Pietruczuk;Piotr Duda
Venue:
Information Sciences: an International Journal
Year:
2014

Abstract

Decision trees are among the most popular tools for mining data streams. In this paper we propose a new algorithm based on the well-known CART algorithm. The most important task in constructing a decision tree for a data stream is to determine the best attribute on which to split the considered node. To solve this problem we apply the Gaussian approximation. The presented algorithm achieves high classification accuracy with a short processing time. The main result of this paper is a theorem showing that the best attribute computed in the considered node from the available data sample is, with high probability, the same as the attribute that would be derived from the whole data stream.
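
The split decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the Gini index as the CART split measure and a threshold of the form z * sqrt(c / n), where z is a standard normal quantile; the abstract states neither the threshold formula nor its constant. The names gini_gain and should_split and the parameters delta and variance_bound are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_gain(rows, labels, attr, threshold):
    """Impurity reduction of the binary split rows[attr] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[attr] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[attr] > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


def should_split(gains, n, delta=0.05, variance_bound=1.0):
    """Return the best attribute if its lead over the runner-up is convincing.

    gains: dict mapping attribute -> gain of its best split on the n samples.
    variance_bound: hypothetical constant, assumed here and not taken
    from the paper.
    Returns None when more samples are needed before splitting.
    """
    ranked = sorted(gains, key=gains.get, reverse=True)
    best, second = ranked[0], ranked[1]
    z = NormalDist().inv_cdf(1.0 - delta)    # standard normal quantile
    epsilon = z * sqrt(variance_bound / n)   # assumed threshold form
    return best if gains[best] - gains[second] > epsilon else None


# Toy sample: attribute 0 separates the classes, attribute 1 does not.
rows = [(0.1, 5.0), (0.2, 1.0), (0.8, 4.0), (0.9, 2.0)]
labels = ["a", "a", "b", "b"]
gains = {0: gini_gain(rows, labels, 0, 0.5), 1: gini_gain(rows, labels, 1, 3.0)}
```

In this sketch the same difference in gain is rejected for a small n and accepted for a large one, which mirrors the idea that a node is split only once the sample is large enough for the observed best attribute to be trusted.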