Suboptimality of penalized empirical risk minimization in classification

  • Authors:
  • Guillaume Lecué

  • Affiliations:
  • Laboratoire de Probabilités et Modèles Aléatoires (UMR CNRS), Université Paris VI, Paris, France

  • Venue:
  • COLT'07: Proceedings of the 20th Annual Conference on Learning Theory
  • Year:
  • 2007

Abstract

Let F be a set of M classification procedures with values in [-1, 1]. Given a loss function, we want to construct a procedure which mimics the best procedure in F at the best possible rate. This fastest rate is called the optimal rate of aggregation. Considering a continuous scale of loss functions with various types of convexity, we prove that the optimal rate of aggregation can be either ((log M)/n)^{1/2} or (log M)/n. We prove that, if all M classifiers are binary, the (penalized) Empirical Risk Minimization procedures are suboptimal (even under the margin/low-noise condition) when the loss function is somewhat more than convex, whereas in that case aggregation procedures with exponential weights achieve the optimal rate of aggregation.
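For concreteness, here is a minimal sketch of an aggregation procedure with exponential weights of the kind the abstract refers to: each classifier f_j in F receives a Gibbs weight proportional to exp(-n A_n(f_j)/T), where A_n(f_j) is its empirical risk under the chosen loss and T is a temperature parameter, and the aggregate is the resulting convex combination of the base classifiers. The function names, the temperature parameter, and the hinge loss in the toy example are illustrative assumptions, not the paper's exact construction.

    import numpy as np

    def exponential_weights_aggregate(F, X, Y, loss, temperature=1.0):
        # Aggregation with exponential (Gibbs) weights: classifier f_j gets
        # weight proportional to exp(-n * A_n(f_j) / T), where
        # A_n(f_j) = (1/n) * sum_i loss(Y_i * f_j(X_i)) is its empirical risk.
        n = len(Y)
        risks = np.array([np.mean(loss(Y * f(X))) for f in F])
        logits = -n * risks / temperature
        logits -= logits.max()          # shift for numerical stability
        weights = np.exp(logits)
        weights /= weights.sum()
        # The aggregate is a convex combination, so it takes values in
        # [-1, 1] whenever every classifier in F does.
        return lambda x: sum(w * f(x) for w, f in zip(weights, F))

    # Toy usage: hinge loss phi(u) = max(0, 1 - u), two base classifiers.
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.normal(size=200)
        Y = np.sign(X)                  # labels in {-1, +1}
        F = [lambda x: np.tanh(x), lambda x: -np.tanh(x)]
        hinge = lambda u: np.maximum(0.0, 1.0 - u)
        f_hat = exponential_weights_aggregate(F, X, Y, hinge)
        print(f_hat(np.array([0.5, -0.5])))

Note that, unlike empirical risk minimization, the aggregate does not select a single classifier from F: the exponential weighting averages the base procedures, which is what allows it to reach the fast (log M)/n rate in the regimes where the paper shows (penalized) ERM is suboptimal.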