Exploiting generative models in discriminative classifiers. In Advances in Neural Information Processing Systems 11 (NIPS 1998).
Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research.
On a theory of learning with similarity functions. In Proceedings of the 23rd International Conference on Machine Learning (ICML 2006).
A discriminative framework for clustering via similarity functions. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC 2008).
Theory and algorithm for learning with dissimilarity functions. Neural Computation.
Expert Systems with Applications: An International Journal
Recently, Balcan and Blum [1] suggested a theory of learning based on general similarity functions, rather than on positive semidefinite kernels. We study the gap between the learning guarantees of kernel-based learning and those obtainable by using the kernel as a similarity function, a question left open by Balcan and Blum. We provide a significantly improved bound on how good a kernel function is when used as a similarity function, and extend the result from the zero-one error rate to the more practically relevant hinge loss. Furthermore, we show that this bound is tight, and hence establish that there is in fact a real gap between the traditional kernel-based notion of margin and the newer similarity-based notion.
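For context, here is a minimal sketch of the two notions of "goodness" being compared, stated roughly along the lines of Balcan and Blum [1]; the exact formulation and parameters vary between papers and are an assumption here, not a quote from this abstract. A similarity function $K \colon X \times X \to [-1,1]$ is $(\epsilon, \gamma)$-good for a distribution $P$ over labeled examples $(x, y)$ with $y \in \{-1, +1\}$ if some weighting $w \colon X \to [0,1]$ gives all but an $\epsilon$ fraction of points a margin of at least $\gamma$:

\[
  \Pr_{(x,y) \sim P}\Big[\, y \, \mathbb{E}_{(x',y') \sim P}\big[\, y' \, w(x') \, K(x,x') \,\big] < \gamma \,\Big] \le \epsilon .
\]

The hinge-loss variant mentioned in the abstract replaces the probability of a margin violation with the expected hinge loss at margin $\gamma$:

\[
  \mathbb{E}_{(x,y) \sim P}\Big[ \big[\, 1 - y \, \mathbb{E}_{(x',y') \sim P}\big[\, y' \, w(x') \, K(x,x') \,\big] / \gamma \,\big]_+ \Big] \le \epsilon .
\]

By contrast, the traditional kernel-based notion asks for a unit-norm predictor $\beta$ in the reproducing kernel Hilbert space of $K$, with feature map $\phi$, satisfying $\mathbb{E}\big[ [\, 1 - y \langle \beta, \phi(x) \rangle / \gamma \,]_+ \big] \le \epsilon$. The gap studied here is how much the margin $\gamma$ must degrade when passing from the kernel notion to the similarity notion.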