Mining streaming data is an active topic in data mining. When classifying data streams, traditional decision-tree algorithms such as ID3 and C4.5 are relatively inefficient in both time and space because of the characteristics of streaming data. Random decision trees offer advantages in both respects. This paper proposes SRMTDS (Semi-Random Multiple decision Trees for Data Streams), an incremental algorithm for mining data streams that is based on random decision trees. SRMTDS uses the Hoeffding bound to choose the minimum number of examples required before a split, a heuristic method to compute the information gain from which the split thresholds of numerical attributes are obtained, and a Naïve Bayes classifier to estimate the class labels at tree leaves. Our extensive experiments show that SRMTDS outperforms VFDTc, a state-of-the-art decision-tree algorithm for classifying data streams, in running time, space usage, accuracy, and robustness to noise.
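As an illustration of the first ingredient, the following is a minimal sketch of the standard Hoeffding bound as it is commonly applied in stream decision trees. It is not taken from the paper, and the exact way SRMTDS applies the bound is not stated in this abstract. The bound says that, with probability at least 1 - delta, the mean of n independent observations of a variable with range R lies within epsilon = sqrt(R^2 ln(1/delta) / (2n)) of its true mean. Solving for n gives the smallest number of examples needed to reach a target epsilon. The function names and the parameter values below are hypothetical, chosen only for the example.

```python
import math


def hoeffding_epsilon(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: deviation epsilon that holds with probability 1 - delta
    after n independent observations of a variable with the given range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def min_examples(value_range: float, delta: float, epsilon: float) -> int:
    """Smallest n for which the Hoeffding deviation is at most epsilon.
    Obtained by solving epsilon = sqrt(R^2 ln(1/delta) / (2n)) for n."""
    n = value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2)
    return math.ceil(n)


if __name__ == "__main__":
    # Assumed illustrative values: information gain of a two-class problem
    # has range R = log2(2) = 1; delta and epsilon are arbitrary choices.
    R, delta, eps = 1.0, 1e-7, 0.05
    n = min_examples(R, delta, eps)
    print("examples needed:", n)
    print("epsilon at that n:", hoeffding_epsilon(R, delta, n))
```

In this sketch a leaf would wait until it has seen at least `min_examples(...)` examples before a split decision is considered reliable. A smaller delta or a smaller epsilon both raise that count.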