Let F be a set of M classification procedures with values in $[-1, 1]$. Given a loss function, we want to construct a procedure that mimics, at the best possible rate, the best procedure in F. This fastest rate is called the optimal rate of aggregation. Considering a continuous scale of loss functions with various types of convexity, we prove that the optimal rate of aggregation is either $((\log M)/n)^{1/2}$ or $(\log M)/n$. We prove that, if all M classifiers are binary, the (penalized) Empirical Risk Minimization procedures are suboptimal (even under the margin/low noise condition) when the loss function is somewhat more than convex, whereas, in that case, aggregation procedures with exponential weights achieve the optimal rate of aggregation.
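The exponential-weights aggregate mentioned above combines the M classifiers through a convex combination whose weights decay exponentially in empirical risk, rather than selecting the single empirical risk minimizer as ERM does. Below is a minimal sketch of that idea in Python; the function name, the hinge loss used as the example convex loss, and the temperature parameter (which the paper would tune precisely to attain the optimal rate) are illustrative assumptions, not the paper's exact procedure.

import numpy as np

def exponential_weights(preds, y, loss, temperature=1.0):
    """Exponential-weights aggregation of M classifiers (illustrative sketch).

    preds: (M, n) array; preds[j] holds classifier f_j's values in [-1, 1]
    y: (n,) array of labels in {-1, +1}
    loss: vectorized loss(prediction, label) -> per-sample losses
    temperature: assumed tuning parameter T > 0 controlling weight decay
    Returns the weight vector w; the aggregate is sum_j w[j] * f_j.
    """
    M, n = preds.shape
    # Empirical risk of each classifier under the given loss.
    risks = np.array([loss(preds[j], y).mean() for j in range(M)])
    # Weights proportional to exp(-n * risk / T); subtracting the minimum
    # risk before exponentiating avoids numerical underflow.
    logits = -n * (risks - risks.min()) / temperature
    w = np.exp(logits)
    return w / w.sum()

# Toy usage: 5 noisy classifiers on n = 200 points, hinge loss.
rng = np.random.default_rng(0)
y = rng.choice([-1.0, 1.0], size=200)
preds = np.clip(y + rng.normal(0.0, 1.0, size=(5, 200)), -1.0, 1.0)
hinge = lambda p, t: np.maximum(0.0, 1.0 - t * p)
w = exponential_weights(preds, y, hinge)
aggregate = w @ preds  # convex combination, still valued in [-1, 1]

Unlike ERM, which commits to one of the M procedures, the aggregate averages over all of them; this averaging is what the abstract credits with achieving the fast $(\log M)/n$ rate for sufficiently convex losses.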