k-norm misclassification rate estimation for decision trees

  • Authors:
  • Mingyu Zhong; Michael Georgiopoulos; Georgios C. Anagnostopoulos

  • Affiliations:
  • University of Central Florida, Orlando, FL; University of Central Florida, Orlando, FL; Florida Institute of Technology, Melbourne, FL

  • Venue:
  • ASC '07 Proceedings of The Eleventh IASTED International Conference on Artificial Intelligence and Soft Computing
  • Year:
  • 2007

Abstract

The decision tree classifier is a well-known methodology for classification. It is widely accepted that a fully grown tree is usually over-fit to the training data and thus should be pruned back. In this paper, we analyze the overtraining issue theoretically using the k-norm risk estimation approach with Lidstone's Estimate. Our analysis allows a deeper understanding of decision tree classifiers, especially of how to estimate their misclassification rates using our equations. We propose a simple pruning algorithm based on our analysis and prove its superior properties, including its independence from validation and its efficiency.
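
For readers unfamiliar with Lidstone's Estimate, the following is a minimal sketch of how it smooths class probabilities at a single tree leaf, and how a smoothed misclassification rate could be read off from them. It illustrates only the standard Lidstone formula, (n_j + λ) / (N + λJ). It is not the paper's k-norm estimator or pruning algorithm, whose equations are not given in the abstract. The function names, the parameter `lam`, and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def lidstone_class_probs(labels, classes, lam=1.0):
    """Lidstone-smoothed class probabilities for the samples at one leaf.

    Each class j gets (n_j + lam) / (N + lam * J), where n_j is the count
    of class j at the leaf, N the number of samples and J the number of
    classes. `lam` is the smoothing parameter (hypothetical default of 1.0);
    it must be positive if the leaf can be empty.
    """
    counts = Counter(labels)
    total = len(labels) + lam * len(classes)
    return {c: (counts[c] + lam) / total for c in classes}


def leaf_error_estimate(labels, classes, lam=1.0):
    """Smoothed misclassification rate when the leaf predicts its majority class."""
    probs = lidstone_class_probs(labels, classes, lam)
    return 1.0 - max(probs.values())


# Toy leaves (hypothetical data): one mixed, one pure.
mixed_error = leaf_error_estimate(["a", "a", "a", "b"], ["a", "b"])
pure_error = leaf_error_estimate(["a", "a", "a"], ["a", "b"])
```

With `lam=1.0`, the pure leaf gets a nonzero error estimate even though its training (resubstitution) error is zero, which is the kind of correction for over-fitting that motivates smoothed estimates in small leaves.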