Digital halftoning
Learnability and the Vapnik-Chervonenkis dimension
Journal of the ACM (JACM)
Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension
COLT '91 Proceedings of the fourth annual workshop on Computational learning theory
Vector quantization and signal compression
ε-approximations with minimum packing constraint violation (extended abstract)
STOC '92 Proceedings of the twenty-fourth annual ACM symposium on Theory of computing
Separating formal bounds from practical performance in learning systems
How tight are the Vapnik-Chervonenkis bounds?
Neural Computation
Digital Pictures: Representation and Compression
Estimation of Dependences Based on Empirical Data (Springer Series in Statistics)
IEEE Transactions on Information Theory
Active learning in neural networks
New learning paradigms in soft computing
Vector quantization and fuzzy ranks for image reconstruction
Image and Vision Computing
Examines how the performance of a memoryless vector quantizer changes as a function of its training set size. Specifically, the authors study how well the training set distortion predicts test distortion when the training set is a randomly drawn subset of blocks from the test or training image(s). Using the Vapnik-Chervonenkis (VC) dimension, the authors derive formal bounds for the difference of test and training distortion of vector quantizer codebooks. The authors then describe extensive empirical simulations that test these bounds for a variety of codebook sizes and vector dimensions, and give practical suggestions for determining the training set size necessary to achieve good generalization from a codebook. The authors conclude that, by using training sets comprising only a small fraction of the available data, one can produce results that are close to the results obtainable when all available data are used.
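The experiment the abstract describes, fitting a codebook on a random subset of blocks and comparing training distortion with distortion on the full data, can be sketched as follows. This is a minimal illustration, not the paper's experimental setup: it uses Lloyd's algorithm (standard k-means codebook design) on synthetic Gaussian "blocks", and all sizes and names are illustrative assumptions.

```python
import numpy as np

def train_codebook(vectors, k, iters=20, seed=0):
    """Fit a k-entry VQ codebook with Lloyd's algorithm (k-means)."""
    rng = np.random.default_rng(seed)
    codebook = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        # assign each training vector to its nearest codeword
        d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        # move each codeword to the centroid of its cell
        for j in range(k):
            members = vectors[labels == j]
            if len(members):
                codebook[j] = members.mean(axis=0)
    return codebook

def distortion(vectors, codebook):
    """Mean squared error of quantizing vectors with the codebook."""
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.min(axis=1).mean()

# synthetic stand-in for image blocks: 2000 four-dimensional vectors
rng = np.random.default_rng(1)
data = rng.normal(size=(2000, 4))

# train on a randomly drawn 10% subset, as in the abstract's setup
train = data[rng.choice(len(data), size=200, replace=False)]
cb = train_codebook(train, k=16)

print("training distortion:", distortion(train, cb))
print("test distortion:    ", distortion(data, cb))
```

Comparing the two printed distortions as the subset size grows is exactly the kind of generalization curve the authors study; the VC-dimension bounds give a worst-case guarantee on the gap between them.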