Developing regression models for large datasets that are both accurate and easy to interpret is an important data mining problem. Regression trees with linear models in the leaves satisfy both requirements, but thus far no truly scalable regression tree algorithm has been known. This paper proposes a novel regression tree construction algorithm (SECRET) that produces trees of high quality and scales to very large datasets. At every node, SECRET uses the EM algorithm for Gaussian mixtures to find two clusters in the data and to locally transform the regression problem into a classification problem based on closeness to these clusters. Goodness-of-split measures, such as the Gini gain, can then be used to determine the split variable and the split point, much as in classification tree construction. Scalability is achieved by employing scalable versions of the EM and classification tree construction algorithms. An experimental evaluation on real and artificial data shows that SECRET attains accuracy comparable to other linear regression tree algorithms while taking orders of magnitude less computation time on large datasets.
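The per-node step the abstract describes can be sketched in plain numpy. This is a minimal, illustrative reconstruction, not the paper's implementation: a hand-rolled spherical-Gaussian EM stands in for the full EM algorithm, hard cluster assignments turn the regression problem into a two-class labeling, and the piecewise-linear toy dataset is invented for the example.

```python
import numpy as np

def em_two_gaussians(Z, n_iter=50):
    """Fit a 2-component spherical Gaussian mixture to the rows of Z with EM.

    Returns the final means and the (n, 2) responsibility matrix.
    """
    n, d = Z.shape
    # Farthest-point initialization: puts one mean in each well-separated cluster.
    i0 = 0
    i1 = int(np.argmax(((Z - Z[i0]) ** 2).sum(axis=1)))
    mu = np.stack([Z[i0], Z[i1]]).astype(float)
    var = np.array([Z.var() + 1e-6] * 2)   # one spherical variance per component
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: log-density of each point under each spherical Gaussian.
        logp = np.stack([
            np.log(pi[k]) - 0.5 * d * np.log(2 * np.pi * var[k])
            - 0.5 * ((Z - mu[k]) ** 2).sum(axis=1) / var[k]
            for k in range(2)], axis=1)
        logp -= logp.max(axis=1, keepdims=True)   # numerical stability
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)         # responsibilities
        # M-step: reweighted mixing proportions, means, and variances.
        nk = r.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (r.T @ Z) / nk[:, None]
        for k in range(2):
            var[k] = (r[:, k] * ((Z - mu[k]) ** 2).sum(axis=1)).sum() / (d * nk[k]) + 1e-6
    return mu, r

def gini(labels):
    p = np.bincount(labels, minlength=2) / max(len(labels), 1)
    return 1.0 - float((p ** 2).sum())

def best_split(X, labels):
    """Exhaustive scan: the (variable, threshold) pair with the highest Gini gain."""
    n, d = X.shape
    parent = gini(labels)
    var_idx, threshold, best_gain = None, None, 0.0
    for j in range(d):
        for t in np.unique(X[:, j])[:-1]:
            mask = X[:, j] <= t
            gain = parent - (mask.sum() * gini(labels[mask])
                             + (~mask).sum() * gini(labels[~mask])) / n
            if gain > best_gain:
                var_idx, threshold, best_gain = j, float(t), gain
    return var_idx, threshold, best_gain

# Toy data: two well-separated linear regimes that switch at x = 0 (illustrative only).
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=200)
y = np.where(x < 0, -10.0 + x, 10.0 + x) + rng.normal(0.0, 0.5, size=200)
X = x.reshape(-1, 1)

# Steps 1-2: cluster the joint (x, y) points, hard-assign to get pseudo-class labels.
Z = np.column_stack([X, y])
_, resp = em_two_gaussians(Z)
labels = resp.argmax(axis=1)

# Step 3: choose the split variable and split point by Gini gain,
# exactly as a classification tree would.
var_idx, threshold, gain = best_split(X, labels)
```

In SECRET itself, scalability comes from substituting scalable versions of EM and of classification tree construction for the naive loops above; the sketch only shows how clustering turns the local regression problem into a splittable classification problem.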