Exact combinatorial bounds on the probability of overfitting for empirical risk minimization

  • Authors: K. V. Vorontsov
  • Affiliations: Dorodnicyn Computing Centre, Russian Academy of Sciences, Moscow, Russia 119333
  • Venue: Pattern Recognition and Image Analysis
  • Year: 2010

Abstract

Three general methods for obtaining exact bounds on the probability of overfitting are proposed within statistical learning theory: a method of generating and destroying sets, a recurrent method, and a blockwise method. Six particular cases illustrate the application of these methods, each a model set of predictors: a pair of predictors, a layer of a Boolean cube, an interval of a Boolean cube, a monotonic chain, a unimodal chain, and a unit neighborhood of the best predictor. For the interval and the unimodal chain, numerical experiments are presented that demonstrate the effects of splitting and similarity on the probability of overfitting.
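
To make the quantity discussed in the abstract concrete, the following Python sketch computes a probability of overfitting by brute force for a small set of predictors. It is not one of the three methods the paper proposes; it simply enumerates every split of a finite sample into training and test parts. Several things here are assumptions not stated in the abstract: each predictor is represented by a binary error vector over the full sample, all splits are taken as equally likely, empirical risk minimization breaks ties pessimistically (the tied predictor with the most test errors is chosen), and overfitting means that the test error rate exceeds the training error rate by at least eps. The names overfitting_probability and monotonic_chain are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction
from itertools import combinations


def overfitting_probability(error_vectors, train_size, eps):
    """Fraction of train/test splits on which ERM overfits by at least eps.

    error_vectors: one tuple of 0/1 per predictor (1 = error on that object).
    Assumption: all splits of the sample are equally likely.
    """
    n = len(error_vectors[0])
    test_size = n - train_size
    bad = 0
    total = 0
    for train in combinations(range(n), train_size):
        train_set = set(train)
        test = [i for i in range(n) if i not in train_set]

        # Assumed pessimistic ERM: fewest training errors first,
        # ties broken in favor of the most test errors.
        def key(e):
            return (sum(e[i] for i in train), -sum(e[i] for i in test))

        chosen = min(error_vectors, key=key)
        train_rate = Fraction(sum(chosen[i] for i in train), train_size)
        test_rate = Fraction(sum(chosen[i] for i in test), test_size)
        if test_rate - train_rate >= eps:
            bad += 1
        total += 1
    return Fraction(bad, total)


def monotonic_chain(n, m, length):
    """Assumed toy chain: predictor d errs on the first m + d objects."""
    return [tuple(1 if i < m + d else 0 for i in range(n))
            for d in range(length + 1)]


# One predictor with 2 errors on 4 objects, 2 objects in training.
print(overfitting_probability([(1, 1, 0, 0)], 2, Fraction(1, 2)))  # 1/6

# A chain of 3 predictors with 0, 1, and 2 errors on the same sample.
print(overfitting_probability(monotonic_chain(4, 0, 2), 2, Fraction(1, 2)))  # 1/2
```

Exhaustive enumeration is feasible only for very small samples, since the number of splits grows combinatorially. Exact closed-form bounds of the kind the abstract describes are what make the same quantity computable for realistic sample sizes.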