Investigation and Reduction of Discretization Variance in Decision Tree Induction

Authors:
Pierre Geurts;Louis Wehenkel
Venue:
ECML '00 Proceedings of the 11th European Conference on Machine Learning
Year:
2000

Abstract

This paper examines the variance introduced by the discretization techniques used to handle continuous attributes in decision tree induction. Several discretization procedures are first studied empirically; methods for reducing the discretization variance are then proposed. The experiments show that discretization variance is large and that it can be reduced significantly at no notable computational cost. The resulting variance reduction mainly improves the interpretability and stability of decision trees, and only marginally their accuracy.
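
To make the notion of discretization variance concrete, the following minimal sketch estimates how much an entropy-based split threshold on one continuous attribute moves from one training sample to the next. It is an illustration under stated assumptions, not the authors' experimental protocol: the synthetic data generator (`make_sample`), the entropy criterion, and the helper names `best_threshold` and `threshold_spread` are all hypothetical choices made for this example.

```python
import math
import random
import statistics


def binary_entropy(pos, n):
    """Shannon entropy (in bits) of a node holding `pos` positives among `n` cases."""
    if n == 0 or pos == 0 or pos == n:
        return 0.0
    p = pos / n
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_threshold(xs, ys):
    """Return the midpoint cut that minimizes the weighted entropy of the two children."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total_pos = sum(y for _, y in pairs)
    left_pos = 0
    best_score, best_cut = float("inf"), None
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no valid cut between identical attribute values
        score = (i * binary_entropy(left_pos, i)
                 + (n - i) * binary_entropy(total_pos - left_pos, n - i)) / n
        if score < best_score:
            best_score, best_cut = score, (lo + hi) / 2
    return best_cut


def make_sample(n, rng):
    """Hypothetical data: x uniform on [0, 1], P(y = 1 | x) rising smoothly around 0.5."""
    xs = [rng.random() for _ in range(n)]
    ys = [1 if rng.random() < 1 / (1 + math.exp(-10 * (x - 0.5))) else 0 for x in xs]
    return xs, ys


def threshold_spread(n, trials=200, seed=0):
    """Mean and standard deviation of the chosen threshold over independent samples."""
    rng = random.Random(seed)
    cuts = [best_threshold(*make_sample(n, rng)) for _ in range(trials)]
    return statistics.mean(cuts), statistics.stdev(cuts)


if __name__ == "__main__":
    for n in (50, 500):
        mean, sd = threshold_spread(n)
        print(f"n={n}: mean threshold {mean:.3f}, std dev {sd:.3f}")
```

The standard deviation reported by `threshold_spread` is one simple proxy for discretization variance: the more the chosen cut point scatters across samples drawn from the same distribution, the less stable the resulting tree is likely to be.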