As in batch learning, one may identify a class of streaming real-world problems that require modeling several targets simultaneously. Due to the dependencies among the targets, simultaneous modeling can be more successful and informative than creating an independent model for each target. As a result, one obtains a smaller model that simultaneously explains the relations between the input attributes and the targets. This problem has not been addressed previously in the streaming setting. We propose an algorithm with low computational complexity for inducing multi-target model trees, based on the principles of predictive clustering trees and using probability bounds to support splitting decisions. Linear models are computed for each target separately by incrementally training perceptrons in the leaves of the tree. Experiments are performed on synthetic and real-world datasets. The multi-target regression tree algorithm produces models for the simultaneous prediction of all target attributes that are equally accurate and smaller than a set of independent regression trees built separately for each target attribute. When the regression surface is smooth, the linear models computed in the leaves significantly improve the accuracy for all of the targets.
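The two ingredients named above can be illustrated in a minimal sketch: a probability (Hoeffding-style) bound that decides when enough examples have been seen to commit to a split, and a leaf that incrementally trains one perceptron (linear unit with a delta-rule update) per target. This is a hypothetical illustration under simplified assumptions, not the authors' implementation; all names (`hoeffding_bound`, `MultiTargetLeaf`) are invented for the sketch.

```python
import math
import random

def hoeffding_bound(value_range, delta, n):
    """With probability 1 - delta, the observed mean of a variable with
    range `value_range` over n samples is within epsilon of the true mean.
    Used to decide whether the best split is reliably better than the rest."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

class MultiTargetLeaf:
    """Leaf node holding one linear perceptron per target attribute,
    trained incrementally (delta rule) on examples reaching the leaf."""

    def __init__(self, n_inputs, n_targets, lr=0.05):
        self.lr = lr
        # weights[t]: n_inputs coefficients plus a trailing bias term
        self.weights = [[0.0] * (n_inputs + 1) for _ in range(n_targets)]

    def predict(self, x):
        # zip stops at len(x), so the trailing bias w[-1] is added separately
        return [w[-1] + sum(wi * xi for wi, xi in zip(w, x))
                for w in self.weights]

    def update(self, x, y):
        # one gradient step per target on the squared prediction error
        preds = self.predict(x)
        for w, y_t, p_t in zip(self.weights, y, preds):
            err = y_t - p_t
            for i, xi in enumerate(x):
                w[i] += self.lr * err * xi
            w[-1] += self.lr * err  # bias update
```

Because each perceptron is updated in O(number of inputs) per example and the bound needs only running counts, both pieces fit the single-pass, low-memory constraints of the streaming setting described above.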