Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of a straightforward implementation of the technique is its computational overhead. In this paper we show that, for decision trees, the computational overhead of cross-validation can be reduced significantly by integrating the cross-validation with the normal decision tree induction process. We discuss how existing decision tree algorithms can be adapted to this end, and provide an analysis of the speedups these adaptations may yield. We identify a number of parameters that influence the obtainable speedups, and validate and refine our analysis with experiments on a variety of data sets using two different implementations. Besides cross-validation, we also briefly explore the usefulness of these techniques for bagging. We conclude with some guidelines concerning when these optimizations should be considered.
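The abstract does not spell out the integration, but the core idea behind this style of optimization can be sketched as follows: when the k training sets of a k-fold cross-validation overlap heavily, the split statistics a decision tree learner needs can be gathered for all folds in a single pass over the data, with each fold's training-set counts obtained by subtracting the held-out fold's counts from the totals. The sketch below is illustrative only (the function name, data layout, and fold encoding are assumptions, not the paper's actual algorithm or code):

```python
from collections import Counter

def split_counts_per_fold(rows, folds, test):
    """Illustrative sketch: per-fold class counts for one candidate split.

    One pass over the data yields the statistics needed by every
    cross-validation fold; the training counts for fold i are the
    totals minus the counts of fold i's held-out examples, so the
    data is not re-scanned once per fold.
    `rows` is a list of (feature, label) pairs, `folds[i]` gives the
    fold of row i, and `test` is the candidate split predicate.
    """
    n_folds = max(folds) + 1
    per_fold = [Counter() for _ in range(n_folds)]
    total = Counter()
    for idx, (x, label) in enumerate(rows):
        if test(x):                      # example goes to this branch
            per_fold[folds[idx]][label] += 1
            total[label] += 1
    # Training-set counts for each fold, derived by subtraction:
    return [total - per_fold[i] for i in range(n_folds)]

# Toy data: feature is a single number, label is 'a' or 'b'.
rows = [(0, 'a'), (1, 'a'), (2, 'b'), (3, 'b'), (4, 'a'), (5, 'b')]
folds = [0, 1, 2, 0, 1, 2]               # 3-fold assignment
train = split_counts_per_fold(rows, folds, lambda x: x >= 2)
# train[i] counts examples with x >= 2 that lie outside fold i.
```

A straightforward implementation would instead rebuild these statistics k times, once per training set; sharing the single pass across folds is where the speedup analyzed in the paper comes from.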