Learning in Neural Networks: Theoretical Foundations
Covering number bounds of certain regularized linear function classes. The Journal of Machine Learning Research.
Sparseness of support vector machines. The Journal of Machine Learning Research.
Consistency of support vector machines and other regularized kernel classifiers. IEEE Transactions on Information Theory.
Entropy and margin maximization for structured output learning. ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III; The Journal of Machine Learning Research.
Maximal Discrepancy for Support Vector Machines. Neurocomputing.
L2-SVM: Dependence on the regularization parameter. Pattern Recognition and Image Analysis.
Properties of the solution of L2-Support Vector Machine as a function of regularization parameter. Pattern Recognition and Image Analysis.
Statistical models and learning algorithms for ordinal regression problems. Information Fusion.
Coherence functions with applications in large-margin classification methods. The Journal of Machine Learning Research.
Asymmetric least squares support vector machine classifiers. Computational Statistics & Data Analysis.
A bagging SVM to learn from positive and unlabeled examples. Pattern Recognition Letters.
Conjugate relation between loss functions and uncertainty sets in classification problems. The Journal of Machine Learning Research.
One of the attractive properties of kernel classifiers such as SVMs is that they often produce sparse solutions. However, the decision functions of these classifiers cannot always be used to estimate the conditional probability of the class label. We investigate the relationship between these two properties and show that they are intimately related: sparseness does not occur where the conditional probabilities can be unambiguously estimated. We consider a family of convex loss functions and derive sharp asymptotic results for the fraction of data that becomes support vectors. This enables us to characterize the exact trade-off between sparseness and the ability to estimate conditional probabilities for these loss functions.
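The mechanism behind sparseness can be seen directly in the hinge loss: training points whose margin exceeds 1 contribute a zero (sub)gradient, so only margin violators (the support vectors) shape the solution. Below is a minimal illustrative sketch, not the paper's analysis: subgradient descent on a regularized hinge loss for a linear classifier on synthetic data, with the dataset, regularization parameter `lam`, and step-size schedule all arbitrary choices made for this example.

```python
import numpy as np

# Illustrative sketch: linear SVM via subgradient descent on the
# regularized hinge loss. Points with y * f(x) <= 1 after training play
# the role of support vectors; all other points receive a zero
# (sub)gradient from the hinge term, which is the source of sparseness.
rng = np.random.default_rng(0)
n, d, lam, lr = 400, 2, 0.01, 0.1
X = rng.normal(size=(n, d))
# Noisy linearly separable-ish labels (arbitrary synthetic setup).
y = np.where(X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=n) > 0, 1.0, -1.0)

w = np.zeros(d)
for t in range(1, 2001):
    margins = y * (X @ w)
    active = margins < 1  # only margin violators contribute to the hinge term
    grad = lam * w - (y[active, None] * X[active]).sum(axis=0) / n
    w -= (lr / t**0.5) * grad

# Fraction of points on or inside the margin: the "support vectors" here.
frac_sv = float(np.mean(y * (X @ w) <= 1))
print(f"fraction of (approximate) support vectors: {frac_sv:.2f}")
```

Replacing the hinge with a loss whose minimizer recovers the conditional probability (e.g. the logistic loss) makes every training point contribute a nonzero gradient, so the solution is no longer sparse, which is the trade-off the abstract describes.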