Exponentiated gradient versus gradient descent for linear predictors
Information and Computation
Potential-reduction methods in mathematical programming
Mathematical Programming: Series A and B - Special issue: interior point methods in theory and practice
A decision-theoretic generalization of on-line learning and an application to boosting
Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing (STOC '94), May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT '95), March 13–15, 1995
Scale-sensitive dimensions, uniform convergence, and learnability
Journal of the ACM (JACM)
Generalization performance of support vector machines and other pattern classifiers
Advances in kernel methods
Entropy numbers, operators and support vector kernels
Advances in kernel methods
Theoretical analysis of a class of randomized regularization methods
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Covering numbers for support vector machines
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Linear hinge loss and average margin
Proceedings of the 1998 conference on Advances in neural information processing systems II
An introduction to support vector machines: and other kernel-based learning methods
An introduction to support vector machines: and other kernel-based learning methods
Learning in Neural Networks: Theoretical Foundations
Learning in Neural Networks: Theoretical Foundations
General Convergence Results for Linear Discriminant Updates
Machine Learning
A Note on a Scale-Sensitive Dimension of Linear Bounded Functionals in Banach Spaces
ALT '97 Proceedings of the 8th International Conference on Algorithmic Learning Theory
Entropy Numbers of Linear Function Classes
COLT '00 Proceedings of the Thirteenth Annual Conference on Computational Learning Theory
Efficient agnostic learning of neural networks with bounded fan-in
IEEE Transactions on Information Theory - Part 2
IEEE Transactions on Information Theory
Structural risk minimization over data-dependent hierarchies
IEEE Transactions on Information Theory
Generalization error bounds for Bayesian mixture algorithms
The Journal of Machine Learning Research
Generalization Error Bounds for Threshold Decision Lists
The Journal of Machine Learning Research
Statistical Analysis of Some Multi-Category Large Margin Classification Methods
The Journal of Machine Learning Research
Finite time bounds for sampling based fitted value iteration
ICML '05 Proceedings of the 22nd international conference on Machine learning
Learning Bounds for Kernel Regression Using Effective Data Dimensionality
Neural Computation
SVM Soft Margin Classifiers: Linear Programming versus Quadratic Programming
Neural Computation
New developments in parsing technology
On the generalization error of fixed combinations of classifiers
Journal of Computer and System Sciences
Sparseness vs Estimating Conditional Probabilities: Some Asymptotic Results
The Journal of Machine Learning Research
Structure compilation: trading structure for features
Proceedings of the 25th international conference on Machine learning
Finite-Time Bounds for Fitted Value Iteration
The Journal of Machine Learning Research
ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
Generalization Bounds for Some Ordinal Regression Algorithms
ALT '08 Proceedings of the 19th international conference on Algorithmic Learning Theory
l1 regularization in infinite dimensional feature spaces
COLT'07 Proceedings of the 20th annual conference on Learning theory
Large margin cost-sensitive learning of conditional random fields
Pattern Recognition
The necessity of combining adaptation methods
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Sequence classification via large margin hidden Markov models
Data Mining and Knowledge Discovery
Probabilities of discrepancy between minima of cross-validation, Vapnik bounds and true risks
International Journal of Applied Mathematics and Computer Science
ICAISC'12 Proceedings of the 11th international conference on Artificial Intelligence and Soft Computing - Volume Part II
Sample complexity of linear learning machines with different restrictions over weights
ICAISC'12 Proceedings of the 11th international conference on Artificial Intelligence and Soft Computing - Volume Part II
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Two
A boosting approach for supervised Mahalanobis distance metric learning
Pattern Recognition
Diversity regularized ensemble pruning
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I
Approximation and estimation bounds for free knot splines
Computers & Mathematics with Applications
Logistic regression with weight grouping priors
Computational Statistics & Data Analysis
PAC-bayes bounds with data dependent priors
The Journal of Machine Learning Research
Generalization ability of fractional polynomial models
Neural Networks
Compressed classification learning with Markov chain samples
Neural Networks
Machine learning with operational costs
The Journal of Machine Learning Research
Distribution-dependent sample complexity of large margin learning
The Journal of Machine Learning Research
Sample complexity bounds have recently been derived for learning problems involving linear function classes, such as neural networks and support vector machines. In many of these theoretical studies, covering numbers play a central role, so it is useful to study covering numbers for linear function classes directly. In this paper, we investigate two closely related methods for deriving upper bounds on these covering numbers. The first method, already employed in some earlier studies, relies on Maurey's lemma; the second uses mistake-bound techniques from online learning. We compare the bounds obtained by the two methods, as well as their consequences in some learning formulations.
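The Maurey-type argument mentioned in the abstract rests on a probabilistic approximation fact: any point in the convex hull of vectors of norm at most b can be approximated by an average of k sampled vectors with expected squared error at most b²/k. The sketch below (not from the paper; the dimensions, norms, and sampling setup are illustrative assumptions) checks this numerically by drawing k atoms i.i.d. from the convex weights and averaging.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: n atoms in R^d with norm exactly b,
# and a target point w in their convex hull.
d, n, b, k = 50, 200, 1.0, 10
atoms = rng.standard_normal((n, d))
atoms *= b / np.linalg.norm(atoms, axis=1, keepdims=True)
p = rng.dirichlet(np.ones(n))   # convex combination weights
w = p @ atoms                   # target point in the convex hull

# Maurey's sampling argument: draw k atoms i.i.d. with probabilities p
# and average them. In expectation, ||w_hat - w||^2 <= b^2 / k.
trials = 5000
errs = []
for _ in range(trials):
    idx = rng.choice(n, size=k, p=p)
    w_hat = atoms[idx].mean(axis=0)
    errs.append(np.sum((w_hat - w) ** 2))

mse = float(np.mean(errs))
print(f"empirical MSE = {mse:.4f}, Maurey bound b^2/k = {b**2 / k:.4f}")
```

Since only b^(2)/k points are needed for a given approximation accuracy, covering the function class reduces to counting such sparse averages, which is how Maurey-style covering number bounds are obtained.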