Overfitting Avoidance as Bias

  • Authors: Cullen Schaffer
  • Affiliation: Department of Computer Science, CUNY/Hunter College, 695 Park Avenue, New York, NY 10021. SCHAFFER@MARNA.HUNTER.CUNY.EDU
  • Venue: Machine Learning
  • Year: 1993

Abstract

Strategies for increasing predictive accuracy through selective pruning have been widely adopted by researchers in decision tree induction. It is easy to get the impression from research reports that there are statistical reasons for believing that these overfitting avoidance strategies do increase accuracy and that, as a research community, we are making progress toward developing powerful, general methods for guarding against overfitting in inducing decision trees. In fact, any overfitting avoidance strategy amounts to a form of bias and, as such, may degrade performance instead of improving it. If pruning methods have often proven successful in empirical tests, this is due, not to the methods, but to the choice of test problems. As examples in this article illustrate, overfitting avoidance strategies are not better or worse, but only more or less appropriate to specific application domains. We are not—and cannot be—making progress toward methods both powerful and general.
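
The abstract's central claim, that pruning is a bias whose value depends on the domain, can be illustrated with a small experiment. The sketch below is not the paper's own study: it uses scikit-learn's DecisionTreeClassifier, approximates pruning with a max_depth limit, and relies on two hypothetical synthetic domains (a simple concept with label noise, and a noise-free concept with feature interactions) chosen only to make the contrast visible.

```python
# Hypothetical illustration (not the paper's experiments): compare an unpruned
# decision tree against an aggressively "pruned" one on two synthetic domains.
# Pruning is approximated here by a max_depth limit; domains, sample sizes,
# and noise levels are assumptions made for the sake of the demonstration.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def noisy_simple_domain(n=500):
    # True concept depends on a single feature; 20% of labels are flipped.
    X = rng.uniform(-1, 1, size=(n, 10))
    y = (X[:, 0] > 0).astype(int)
    flip = rng.random(n) < 0.2
    y[flip] = 1 - y[flip]
    return X, y

def clean_complex_domain(n=500):
    # True concept is a parity-like interaction of three features; no noise.
    X = rng.uniform(-1, 1, size=(n, 10))
    y = ((X[:, 0] > 0) ^ (X[:, 1] > 0) ^ (X[:, 2] > 0)).astype(int)
    return X, y

for name, make in [("noisy simple", noisy_simple_domain),
                   ("clean complex", clean_complex_domain)]:
    X, y = make()
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5,
                                               random_state=0)
    full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    pruned = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_tr, y_tr)
    print(f"{name}: unpruned={full.score(X_te, y_te):.2f} "
          f"pruned={pruned.score(X_te, y_te):.2f}")
```

On the noisy simple domain the depth limit tends to help, since the unpruned tree fits label noise; on the clean complex domain the same limit prevents the tree from representing the interaction at all, so it hurts. Which behavior one sees depends entirely on the domain, which is the point of treating overfitting avoidance as a bias.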