Estimating the Optimal Margins of Embeddings in Euclidean Half Spaces
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
This paper considers the embeddability of general concept classes in Euclidean half spaces. By an embedding in half spaces we mean a mapping from a concept class to half spaces that retains the labeling given to points in the instance space. An embedding of some class may be used to learn it with an algorithm for the class it is embedded into; the Support Vector Machines paradigm employs this idea in the construction of a general learning system. We show that an overwhelming majority of the finite concept classes of constant VC dimension d cannot be embedded in low-dimensional half spaces. (In fact, we show that the Euclidean dimension must be almost as high as the size of the instance space.) We strengthen this result further by showing that an overwhelming majority of the finite concept classes of constant VC dimension d cannot be embedded in half spaces (of arbitrarily high Euclidean dimension) with a large margin. (In fact, the margin cannot be substantially larger than the margin achieved by the trivial embedding.) Furthermore, these bounds are robust in the sense that allowing each image half space to err on a small fraction of the instances does not significantly weaken the dimension and margin bounds. Our results indicate that any universal learning machine that transforms data into a Euclidean space and then applies linear (or large-margin) classification cannot enjoy meaningful generalization guarantees based on either VC dimension or margin considerations.
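As a concrete illustration (not taken from the paper itself), here is a minimal Python sketch of the trivial embedding the abstract refers to, assuming the finite concept class is represented as a +/-1 sign matrix whose rows are concepts and whose columns are instances; the helper names trivial_embedding and margin are illustrative, not from the paper.

import numpy as np

def trivial_embedding(M):
    """Embed instance x as the standard basis vector e_x of R^n and each
    concept as its unit-normalized row of labels; sign(<w_c, phi(x)>)
    then recovers the label M[c, x] exactly."""
    n_concepts, n_instances = M.shape
    phi = np.eye(n_instances)                           # instance x -> e_x
    W = M / np.linalg.norm(M, axis=1, keepdims=True)    # unit-norm concept vectors
    return phi, W

def margin(phi, W):
    """Smallest normalized |<w_c, phi(x)>| over all concept/instance pairs."""
    scores = np.abs(W @ phi.T)                # |inner products|, one row per concept
    norms = np.linalg.norm(phi, axis=1)       # instance norms (all 1 here)
    return (scores / norms).min()

# Example: a hypothetical class of 4 concepts over 8 instances.
rng = np.random.default_rng(0)
M = rng.choice([-1, 1], size=(4, 8))
phi, W = trivial_embedding(M)
assert np.array_equal(np.sign(W @ phi.T), M)  # the labeling is retained
print(margin(phi, W))                         # 1/sqrt(8), about 0.354

Under this representation the trivial embedding always attains margin 1/sqrt(n), where n is the size of the instance space; the abstract's second lower bound says that, for an overwhelming majority of classes of constant VC dimension, no embedding into half spaces of any dimension achieves a substantially larger margin.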