Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators

Authors:
R. C. Williamson;A. J. Smola;B. Schölkopf
Affiliations:
Res. Sch. of Inf. Sci., Australian Nat. Univ., Canberra, ACT;-;-
Venue:
IEEE Transactions on Information Theory
Year:
2001

Citing 0
Cited 33

The covering number in learning theory

Journal of Complexity
Regularized Principal Manifolds

EuroCOLT '99 Proceedings of the 4th European Conference on Computational Learning Theory
Rademacher and Gaussian Complexities: Risk Bounds and Structural Results

COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Learning Relatively Small Classes

COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Geometric Parameters of Kernel Machines

COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
On different facets of regularization theory

Neural Computation
A tutorial on support vector regression

Statistics and Computing
Support Vector Machine Soft Margin Classifiers: Error Analysis

The Journal of Machine Learning Research
Kernel Matched Subspace Detectors for Hyperspectral Target Detection

IEEE Transactions on Pattern Analysis and Machine Intelligence
Learning Bounds for Kernel Regression Using Effective Data Dimensionality

Neural Computation
Estimating the Support of a High-Dimensional Distribution

Neural Computation
Statistical properties of kernel principal component analysis

Machine Learning
The performance bounds of learning machines based on exponentially strongly mixing sequences

Computers & Mathematics with Applications
Nonparametric Quantile Estimation

The Journal of Machine Learning Research
Estimates of covering numbers of convex sets with slowly decaying orthogonal subsets

Discrete Applied Mathematics
The covering number for some Mercer kernel Hilbert spaces

Journal of Complexity
Aggregation of SVM Classifiers Using Sobolev Spaces

The Journal of Machine Learning Research
On Relevant Dimensions in Kernel Feature Spaces

The Journal of Machine Learning Research
Generalization performance of ν-support vector classifier based on conditional value-at-risk minimization

Neurocomputing
Oracle inequalities for support vector machines that are based on random entropy numbers

Journal of Complexity
A framework for kernel-based multi-category classification

Journal of Artificial Intelligence Research
Review: Using support vector machines in diagnoses of urological dysfunctions

Expert Systems with Applications: An International Journal
Classification Using Geometric Level Sets

The Journal of Machine Learning Research
Manifold regularization based semisupervised semiparametric regression

Neurocomputing
Estimation of learning rate of least square algorithm via Jackson operator

Neurocomputing
Logistic classification with varying Gaussians

Computers & Mathematics with Applications
Covering numbers of Gaussian reproducing kernel Hilbert spaces

Journal of Complexity
The consistency analysis of coefficient regularized classification with convex loss

WSEAS Transactions on Mathematics
Consistency and error analysis of Prior-Knowledge-Based Kernel Regression

Neurocomputing
Predicting seminal quality with artificial intelligence methods

Expert Systems with Applications: An International Journal
Analysis of convergence performance of neural networks ranking algorithm

Neural Networks
On the convergence rate of lp-norm multiple kernel learning

The Journal of Machine Learning Research
Using robust dispersion estimation in support vector machines

Pattern Recognition

Quantified Score

Hi-index	754.84

Abstract

We derive new bounds for the generalization error of kernel machines, such as support vector machines and related regularization networks, by obtaining new bounds on their covering numbers. The proofs make use of a viewpoint that is apparently novel in the field of statistical learning theory. The hypothesis class is described in terms of a linear operator mapping from a possibly infinite-dimensional unit ball in feature space into a finite-dimensional space. The covering numbers of the class are then determined via the entropy numbers of the operator. These numbers, which characterize the degree of compactness of the operator, can be bounded in terms of the eigenvalues of an integral operator induced by the kernel function used by the machine. As a consequence, we are able to theoretically explain the effect of the choice of kernel function on the generalization performance of support vector machines.
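
The chain of reasoning in the abstract can be written out as a short sketch. The notation below (the feature map \Phi, the weight bound \Lambda, the evaluation operator S_X, the unit ball U, the measure \mu, and the symbols \epsilon_n and \lambda_j) is assumed here for illustration and does not appear on this page; the exact statements, norms, and constants are given in the paper itself.

```latex
% Hypothesis class: linear functions in feature space with bounded weight norm
% (\Phi and \Lambda are illustrative names, not taken from this page).
\[
  \mathcal{F}_\Lambda = \{\, x \mapsto \langle w, \Phi(x) \rangle : \|w\| \le \Lambda \,\}
\]

% Evaluation operator on a sample X = (x_1, \dots, x_m): it maps the
% (possibly infinite-dimensional) weight vector to a finite-dimensional space.
\[
  S_X : w \mapsto \bigl( \langle w, \Phi(x_1) \rangle, \dots, \langle w, \Phi(x_m) \rangle \bigr) \in \ell_\infty^m
\]

% n-th entropy number of S_X, with U the unit ball in feature space.
\[
  \epsilon_n(S_X) = \inf \{\, \epsilon > 0 : S_X(U) \text{ can be covered by } n \text{ balls of radius } \epsilon \text{ in } \ell_\infty^m \,\}
\]

% Covering numbers are the functional inverse of entropy numbers,
% and entropy numbers scale linearly: \epsilon_n(\Lambda S_X) = \Lambda \, \epsilon_n(S_X).
\[
  \Lambda \, \epsilon_n(S_X) < \epsilon \;\Longrightarrow\; \mathcal{N}(\epsilon, \mathcal{F}_\Lambda, X) \le n
\]

% Integral operator induced by the kernel k, with eigenvalues
% \lambda_1 \ge \lambda_2 \ge \dots \ge 0. The abstract states that the
% entropy numbers can be bounded in terms of these eigenvalues.
\[
  (T_k f)(x) = \int k(x, y) \, f(y) \, d\mu(y)
\]
```

Read this way, the choice of kernel enters the generalization bound only through the eigenvalues of T_k: those eigenvalues bound the entropy numbers, and the entropy numbers determine the covering numbers.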