Average-Case Analysis of Classification Algorithms for Boolean Functions and Decision Trees

  • Authors:
  • Tobias Scheffer

  • Affiliations:
  • -

  • Venue:
  • ALT '00 Proceedings of the 11th International Conference on Algorithmic Learning Theory
  • Year:
  • 2000

Quantified Score

Hi-index 0.00

Visualization

Abstract

We conduct an average-case analysis of the generalization error rate of classification algorithms with finite model classes. Unlike worst-case approaches, we do not rely on bounds that hold for all possible learning problems. Instead, we study the behavior of a learning algorithm for a given problem, taking properties of the problem and the learner into account. The solution depends only on known quantities (e.g., the sample size), and the histogram of error rates in the model class which we determine for the case that the sought target is a randomly drawn Boolean function. We then discuss how the error histogram can be estimated from a given sample and thus show how the analysis can be applied approximately in the more realistic scenario that the target is unknown. Experiments show that our analysis can predict the behavior of decision tree algorithms fairly accurately even if the error histogram is estimated from a sample.