Error-Based Pruning of Decision Trees Grown on Very Large Data Sets Can Work!

  • Authors:
  • Lawrence O. Hall; Richard Collins; Kevin W. Bowyer; Robert Banfield

  • Venue:
  • ICTAI '02 Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence
  • Year:
  • 2002

Abstract

It has been asserted that, with traditional pruning methods, growing decision trees on increasingly large amounts of training data yields ever larger trees even when accuracy does not increase. With regard to error-based pruning, the experimental data used to support this assertion appear to have been obtained with the default pruning strength, namely the default certainty factor of 25% in the C4.5 decision tree implementation. We show that, in general, an appropriate setting of the certainty factor for error-based pruning causes decision tree size to plateau when accuracy no longer increases with more training data.
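
In error-based pruning, each leaf's observed error count is replaced by a pessimistic estimate: the number of training cases at the leaf multiplied by an upper confidence limit on the true error rate at the chosen certainty factor, and a subtree is pruned when the single leaf that would replace it is not predicted to make more errors than the subtree's leaves combined. Lowering the certainty factor inflates these estimates, most sharply for small leaves, which is why a stricter setting can keep trees from growing when accuracy is not improving. The sketch below is a minimal illustration of that estimate under the assumption of a Clopper-Pearson style binomial upper bound standing in for C4.5's equivalent limit; the function names and the use of SciPy are illustrative choices, not the paper's code.

```python
# Minimal sketch of the pessimistic error estimate behind error-based pruning.
# Assumption: a Clopper-Pearson style binomial upper limit is used here as a
# stand-in for C4.5's upper confidence bound; names are illustrative only.
from scipy.stats import beta


def pessimistic_errors(n_cases, n_errors, cf=0.25):
    """Predicted errors at a leaf: n_cases times the upper confidence limit on
    the true error rate, given n_errors observed errors at confidence level cf.
    A smaller cf gives a more pessimistic estimate and hence more pruning."""
    if n_cases == 0:
        return 0.0
    if n_errors >= n_cases:
        return float(n_cases)
    # Upper limit p such that P(Binomial(n_cases, p) <= n_errors) = cf.
    return n_cases * beta.ppf(1.0 - cf, n_errors + 1, n_cases - n_errors)


def should_prune(leaf_cases, leaf_errors, subtree_leaves, cf=0.25):
    """Prune a subtree when the collapsed leaf (leaf_cases, leaf_errors) is not
    predicted to make more errors than the sum over the subtree's leaves,
    where subtree_leaves is a list of (n_cases, n_errors) pairs."""
    leaf_est = pessimistic_errors(leaf_cases, leaf_errors, cf)
    subtree_est = sum(pessimistic_errors(n, e, cf) for n, e in subtree_leaves)
    return leaf_est <= subtree_est


if __name__ == "__main__":
    # Small, error-free leaves are penalized hardest as cf shrinks, which is
    # what drives extra pruning (and flatter tree growth) at stricter settings.
    for cf in (0.25, 0.10, 0.01):
        print(cf,
              round(pessimistic_errors(6, 0, cf), 2),
              round(pessimistic_errors(100, 10, cf), 2))
```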