Decision trees are among the most popular tools for mining data streams. In this paper we propose a new algorithm based on the well-known CART algorithm. The most important task in constructing a decision tree for a data stream is determining the best attribute on which to split the considered node. To solve this problem we apply the Gaussian approximation. The presented algorithm achieves high classification accuracy with a short processing time. The main result of this paper is a theorem showing that the best attribute computed in the considered node from the available data sample is, with high probability, the same as the attribute that would be derived from the whole data stream.
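The split decision described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes binary splits scored by Gini gain (as in CART) and a threshold of the generic form `scale * z / sqrt(n)`, where `z` is the standard normal quantile at level `1 - delta`. The names `gini_gain`, `should_split`, `delta`, and `scale` are hypothetical, and `scale` stands in for whatever constant the paper's theorem actually prescribes.

```python
from math import sqrt
from statistics import NormalDist


def gini(counts):
    """Gini impurity of a list of per-class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_gain(left_counts, right_counts):
    """Impurity reduction of a binary split, given per-class counts on each side."""
    n_left, n_right = sum(left_counts), sum(right_counts)
    n = n_left + n_right
    parent = [l + r for l, r in zip(left_counts, right_counts)]
    weighted = (n_left / n) * gini(left_counts) + (n_right / n) * gini(right_counts)
    return gini(parent) - weighted


def should_split(gains, n, delta=0.05, scale=1.0):
    """Return the best attribute if its lead over the runner-up is significant.

    gains: dict mapping attribute name -> gain estimated from n samples.
    scale: hypothetical constant; an assumption, not the paper's value.
    Returns None when more data should be collected before splitting.
    """
    ranked = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
    (best, g1), (_, g2) = ranked[0], ranked[1]
    z = NormalDist().inv_cdf(1.0 - delta)  # standard normal quantile
    epsilon = scale * z / sqrt(n)          # threshold shrinks as n grows
    return best if g1 - g2 > epsilon else None
```

With more samples in the node the threshold shrinks, so a fixed difference between the two best attributes eventually becomes large enough to commit to a split; with few samples the function returns `None` and the node keeps accumulating data.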