Generalization Performance of Classifiers in Terms of Observed Covering Numbers

Authors:
John Shawe-Taylor; Nello Cristianini
Venue:
EuroCOLT '99 Proceedings of the 4th European Conference on Computational Learning Theory
Year:
1999

Abstract

It is known that the covering numbers of a function class on a double sample (of length 2m) can be used to bound the generalization performance of a classifier via a margin-based analysis. In this paper we show that an analogous argument can be made in terms of the observed covering numbers on a single m-sample (the data points actually observed). This is significant because, for certain interesting classes of functions such as support vector machines, new techniques make it possible to find good estimates of these covering numbers from the rate of decay of the eigenvalues of a Gram matrix. The covering numbers can be much smaller than a priori bounds indicate when the particular data received are "easy". The work can be viewed as an extension of previous results that bounded generalization performance in terms of the VC-dimension of the hypothesis class restricted to the sample, with the considerable advantage that the covering numbers can be readily computed and are often small.
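
To make the two quantities in the abstract concrete, the following is a minimal sketch and not the paper's construction or its bound. It assumes NumPy, an RBF kernel, and a finite set of candidate functions represented by their values on the sample; the names `rbf_gram`, `gram_spectrum` and `greedy_cover_size` are hypothetical helpers introduced only for illustration. It computes the eigenvalue spectrum of a Gram matrix on an m-sample, and a greedy upper bound on the sup-norm covering number of the function values observed on that sample.

```python
import numpy as np


def rbf_gram(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) on the sample X.

    The RBF kernel is an illustrative assumption, not taken from the paper.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(d2, 0.0))


def gram_spectrum(K):
    """Eigenvalues of the symmetric Gram matrix, largest first."""
    return np.linalg.eigvalsh(K)[::-1]


def greedy_cover_size(F, eps):
    """Size of a greedily built eps-cover of the rows of F in the sup norm.

    Each row of F holds one function's values on the m sample points.
    Every row ends up within eps of some chosen center, so the result
    upper-bounds the minimal covering number of this finite set at scale eps.
    """
    centers = []
    for f in F:
        if not any(np.max(np.abs(f - c)) <= eps for c in centers):
            centers.append(f)
    return len(centers)


# Illustrative data only: a random 2-D sample of size m = 50.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
spectrum = gram_spectrum(rbf_gram(X, gamma=0.5))

# Random linear functions evaluated on the sample, one per row.
W = rng.normal(size=(200, 2))
F = W @ X.T
cover_size = greedy_cover_size(F, eps=1.0)
```

Inspecting how quickly `spectrum` falls off gives a rough sense of what "fast eigenvalue decay" means on a given sample, while `cover_size` depends only on the observed points, which is the sense in which such quantities are data-dependent. Neither number is the estimate derived in the paper.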