Digital halftoning
Learnability and the Vapnik-Chervonenkis dimension
Journal of the ACM (JACM)
Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension
COLT '91 Proceedings of the fourth annual workshop on Computational learning theory
Vector quantization and signal compression
ε-approximations with minimum packing constraint violation (extended abstract)
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Separating formal bounds from practical performance in learning systems
How tight are the Vapnik-Chervonenkis bounds?
Neural Computation
Digital Pictures: Representation and Compression
Estimation of Dependences Based on Empirical Data (Springer Series in Statistics)
IEEE Transactions on Information Theory
Active learning in neural networks
New learning paradigms in soft computing
Vector quantization and fuzzy ranks for image reconstruction
Image and Vision Computing
Examines how the performance of a memoryless vector quantizer changes as a function of its training set size. Specifically, the authors study how well the training set distortion predicts test distortion when the training set is a randomly drawn subset of blocks from the test or training image(s). Using the Vapnik-Chervonenkis (VC) dimension, the authors derive formal bounds for the difference of test and training distortion of vector quantizer codebooks. The authors then describe extensive empirical simulations that test these bounds for a variety of codebook sizes and vector dimensions, and give practical suggestions for determining the training set size necessary to achieve good generalization from a codebook. The authors conclude that, by using training sets comprising only a small fraction of the available data, one can produce results that are close to the results obtainable when all available data are used.
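The experiment the abstract describes, fitting a codebook on a random subset of blocks and comparing training distortion with distortion on the full data, can be sketched as follows. This is a minimal illustration, not the paper's experimental setup: it uses Lloyd's algorithm (standard k-means codebook design) on synthetic Gaussian "blocks", and all sizes and names are illustrative assumptions.

```python
import numpy as np

def train_codebook(vectors, k, iters=20, seed=0):
    """Fit a k-entry VQ codebook with Lloyd's algorithm (k-means)."""
    rng = np.random.default_rng(seed)
    codebook = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        # assign each training vector to its nearest codeword
        d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        # move each codeword to the centroid of its cell
        for j in range(k):
            members = vectors[labels == j]
            if len(members):
                codebook[j] = members.mean(axis=0)
    return codebook

def distortion(vectors, codebook):
    """Mean squared error of quantizing vectors with the codebook."""
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.min(axis=1).mean()

# synthetic stand-in for image blocks: 2000 four-dimensional vectors
rng = np.random.default_rng(1)
data = rng.normal(size=(2000, 4))

# train on a randomly drawn 10% subset, as in the abstract's setup
train = data[rng.choice(len(data), size=200, replace=False)]
cb = train_codebook(train, k=16)

print("training distortion:", distortion(train, cb))
print("test distortion:    ", distortion(data, cb))
```

Comparing the two printed distortions as the subset size grows is exactly the kind of generalization curve the authors study; the VC-dimension bounds give a worst-case guarantee on the gap between them.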