The Biases of Decision Tree Pruning Strategies

  • Authors:
  • Tapio Elomaa

  • Venue:
  • IDA '99 Proceedings of the Third International Symposium on Advances in Intelligent Data Analysis
  • Year:
  • 1999

Abstract

Post-pruning of decision trees has been a successful approach in many real-world experiments, but over all possible concepts it does not bring any inherent improvement to an algorithm's performance. This work explores how a PAC-proven decision tree learning algorithm fares in comparison with two variants of the standard top-down induction of decision trees. The algorithm does not prune its hypothesis per se, but it can be understood to perform pre-pruning of the evolving tree. We study a backtracking search algorithm, called Rank, for learning rank-minimal decision trees. Our experiments follow closely those performed by Schaffer [20] and confirm his main findings: pruning helps when learning concepts with a simple description, but for concepts with a complex description, and when all concepts are equally likely, pruning is injurious rather than beneficial to the average performance of greedy top-down induction of decision trees. Pre-pruning, as the gentler technique, attains an average performance that falls between not pruning at all and post-pruning.
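
For orientation, the rank that the Rank algorithm minimizes is commonly defined recursively (following Ehrenfeucht and Haussler): a leaf has rank 0, and an internal node whose subtrees have ranks r1 and r2 has rank max(r1, r2) if the ranks differ, and r1 + 1 if they are equal. The Python sketch below illustrates only this rank computation, not the paper's backtracking search itself; the Node class and example trees are hypothetical.

```python
# Minimal sketch of computing the rank of a binary decision tree,
# using the standard recursive definition; not the paper's Rank algorithm.

class Node:
    def __init__(self, left=None, right=None):
        self.left = left      # None for a leaf
        self.right = right    # None for a leaf

def rank(node):
    """Leaves have rank 0; an internal node has rank max(r_left, r_right)
    if the subtree ranks differ, and r_left + 1 if they are equal."""
    if node.left is None and node.right is None:
        return 0
    r_left, r_right = rank(node.left), rank(node.right)
    return max(r_left, r_right) if r_left != r_right else r_left + 1

# Example: a complete tree of depth 2 has rank 2, while a decision-list
# shaped tree with the same number of leaves has rank 1.
leaf = Node
complete = Node(Node(leaf(), leaf()), Node(leaf(), leaf()))
chain = Node(leaf(), Node(leaf(), Node(leaf(), leaf())))
print(rank(complete), rank(chain))   # -> 2 1
```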