Data-dependent margin-based generalization bounds for classification

  • Authors:
  • András Antos;Balázs Kégl;Tamás Linder;Gábor Lugosi

  • Affiliations:
  • Department of Mathematics and Statistics, Queen's University, Kingston, Ontario, Canada K7L 3N6;Department of Computer Science and Operational Research, University of Montreal, C.P. 6128 Succ. Centre-Ville, Canada, H3C 3J7;Department of Mathematics and Statistics, Queen's University, Kingston, Ontario, Canada K7L 3N6;Department of Economics, Pompeu Fabra University, Ramon Trias Fargas 25-27, 08005 Barcelona, Spain

  • Venue:
  • The Journal of Machine Learning Research
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

We derive new margin-based inequalities for the probability of error of classifiers. The main feature of these bounds is that they can be calculated using the training data and therefore may be effectively used for model selection purposes. In particular, the bounds involve empirical complexities measured on the training data (such as the empirical fat-shattering dimension) as opposed to their worst-case counterparts traditionally used in such analyses. Also, our bounds appear to be sharper and more general than recent results involving empirical complexity measures. In addition, we develop an alternative data-based bound for the generalization error of classes of convex combinations of classifiers involving an empirical complexity measure that is easier to compute than the empirical covering number or fat-shattering dimension. We also show examples of efficient computation of the new bounds.