This paper presents a system for inducing a forest of functional trees from data streams that is able to detect concept drift. The Ultra Fast Forest of Trees (UFFT) is an incremental algorithm that works online, processing each example in constant time and performing a single scan over the training examples. It uses analytical techniques to choose the splitting criteria and information gain to estimate the merit of each possible splitting-test. For multi-class problems the algorithm grows a binary tree for each possible pair of classes, leading to a forest of trees. Decision nodes and leaves contain naive-Bayes classifiers that play different roles during the induction process. Naive-Bayes classifiers at leaves are used to classify test examples; naive-Bayes classifiers at inner nodes can serve as multivariate splitting-tests if chosen by the splitting criteria, and are used to detect drift in the distribution of the examples that traverse the node. When drift is detected, the subtree rooted at that node is pruned. The use of naive-Bayes classifiers at leaves to classify test examples, of splitting-tests based on the outcome of naive-Bayes, and of naive-Bayes classifiers at decision nodes to detect drift all derive directly from the sufficient statistics required to compute the splitting criteria, with no additional computation. This is a major advantage in the context of high-speed data streams. The methodology was tested on artificial and real-world data sets. The experimental results show very good performance compared with a batch decision-tree learner and a high capacity to detect and react to drift.
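Two of the ideas the abstract describes can be sketched compactly: an incremental (single-pass, constant time per example) naive-Bayes classifier built from running sufficient statistics, and the one-vs-one decomposition in which a separate binary model is grown per pair of classes and predictions are combined by voting. The sketch below is an illustration of those two ideas only, under simplifying assumptions; it is not the authors' UFFT (the tree growing, splitting criteria, and drift-detection/pruning machinery are omitted), and all class and method names are invented for the example.

```python
import itertools
import math
from collections import defaultdict


class IncrementalGaussianNB:
    """Incremental Gaussian naive Bayes: per-class running means and
    variances maintained with Welford's algorithm, so each example is
    processed once, in constant time (the 'sufficient statistics')."""

    def __init__(self):
        self.n = defaultdict(int)  # examples seen per class
        self.mean = {}             # class -> per-feature running means
        self.m2 = {}               # class -> per-feature sums of squared deviations

    def update(self, x, y):
        if y not in self.mean:
            self.mean[y] = [0.0] * len(x)
            self.m2[y] = [0.0] * len(x)
        self.n[y] += 1
        for i, v in enumerate(x):
            d = v - self.mean[y][i]
            self.mean[y][i] += d / self.n[y]
            self.m2[y][i] += d * (v - self.mean[y][i])

    def log_posterior(self, x, y):
        # log prior + sum of per-feature Gaussian log-likelihoods
        total = math.log(self.n[y] / sum(self.n.values()))
        for i, v in enumerate(x):
            var = self.m2[y][i] / self.n[y] + 1e-9  # small floor avoids div by zero
            total -= 0.5 * (math.log(2 * math.pi * var)
                            + (v - self.mean[y][i]) ** 2 / var)
        return total

    def predict(self, x):
        return max(self.n, key=lambda y: self.log_posterior(x, y))


class PairwiseForest:
    """One binary model per pair of classes (one-vs-one, as in UFFT's
    forest of binary trees); the final label is the majority vote."""

    def __init__(self, classes):
        self.models = {pair: IncrementalGaussianNB()
                       for pair in itertools.combinations(sorted(classes), 2)}

    def update(self, x, y):
        # Each example only trains the models whose class pair includes y.
        for pair, model in self.models.items():
            if y in pair:
                model.update(x, y)

    def predict(self, x):
        votes = defaultdict(int)
        for model in self.models.values():
            votes[model.predict(x)] += 1
        return max(votes, key=votes.get)
```

For example, streaming three well-separated one-dimensional classes through `update` and then calling `predict` recovers the nearest class by majority vote over the three pairwise models. In the real system the analogous statistics are kept at tree leaves and decision nodes, which is why the classification, splitting-test, and drift-detection roles come at no extra cost beyond the split computation.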