In this paper we investigate connections between statistical learning theory and data compression in the context of support vector machine (SVM) model selection. Inspired by several generalization bounds, we construct "compression coefficients" for SVMs that measure the amount by which the training labels can be compressed by a code built from the separating hyperplane. The main idea is to relate the coding precision to geometric concepts such as the width of the margin and the shape of the data in the feature space. The resulting compression coefficients combine well-known quantities such as the radius-margin term R²/ρ², the eigenvalues of the kernel matrix, and the number of support vectors. To test whether they are useful in practice, we ran model selection experiments on benchmark data sets and found that the compression coefficients can fairly accurately predict the parameters for which the test error is minimized.
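As an illustration of the idea, the following minimal sketch shows how a compression-style score combining the radius-margin term R²/ρ² with the number of support vectors could drive kernel-parameter selection. This is not the paper's exact construction: the function name radius_margin_score, the additive combination of the two terms, and the crude bound R² ≤ 1 for an RBF kernel are all assumptions made for the example.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.svm import SVC

    def radius_margin_score(X, y, gamma, C=1.0):
        # Hypothetical compression-style score: a toy combination of the
        # radius-margin term R^2 * ||w||^2 (= R^2 / rho^2, since the margin
        # satisfies rho = 1/||w||) and the number of support vectors.
        K = rbf_kernel(X, X, gamma=gamma)
        clf = SVC(kernel="precomputed", C=C).fit(K, y)
        sv = clf.support_                 # indices of the support vectors
        alpha_y = clf.dual_coef_.ravel()  # alpha_i * y_i for each support vector
        # ||w||^2 = sum_ij (alpha_i y_i)(alpha_j y_j) K(x_i, x_j); non-SVs
        # contribute nothing because their alpha_i = 0.
        w_norm_sq = alpha_y @ K[np.ix_(sv, sv)] @ alpha_y
        # For an RBF kernel K(x, x) = 1, so every point lies on the unit
        # sphere in feature space and R^2 <= 1 is a crude bound; a tighter R
        # would require solving the minimal enclosing ball problem.
        R_sq = 1.0
        return R_sq * w_norm_sq + len(sv)

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    gammas = np.logspace(-3, 1, 9)
    scores = [radius_margin_score(X, y, g) for g in gammas]
    print("selected gamma:", gammas[int(np.argmin(scores))])

In the same spirit as the experiments described above, the kernel width minimizing such a score would be selected without ever touching a validation set; the paper's actual coefficients additionally involve the eigenvalues of the kernel matrix, which this sketch omits.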