Kernel functions have become an extremely popular tool in machine learning, and they come with an attractive theory as well. This theory views a kernel as implicitly mapping data points into a possibly very high-dimensional space, and describes a kernel function as good for a given learning problem if the data are separable by a large margin in that implicit space. However, while quite elegant, this theory does not necessarily correspond to the intuition of a good kernel as a good measure of similarity, and the underlying margin in the implicit space is usually not apparent in "natural" representations of the data. It may therefore be difficult for a domain expert to use the theory to help design an appropriate kernel for the learning task at hand. Moreover, the requirement of positive semi-definiteness may rule out the most natural pairwise similarity functions for the given problem domain.

In this work we develop an alternative, more general theory of learning with similarity functions (i.e., sufficient conditions for a similarity function to allow one to learn well) that does not require reference to implicit spaces and does not require the function to be positive semi-definite (or even symmetric). Instead, our theory is phrased in terms of more direct properties of how the function behaves as a similarity measure. Our results also generalize the standard theory in the sense that any good kernel function under the usual definition can be shown to also be a good similarity function under our definition (though with some loss in the parameters). In this way, we provide the first steps towards a theory of kernels and more general similarity functions that describes the effectiveness of a given function in terms of natural similarity-based properties.
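The algorithmic idea behind this theory lends itself to a short illustration. A similarity function that is "good" in the above sense can be used for learning by mapping each example to its vector of similarities to a set of randomly drawn landmark examples, and then training an ordinary linear separator in that explicit feature space; roughly speaking, a good similarity function guarantees that, with enough landmarks, the mapped data are separable by a large margin with high probability. The Python sketch below is a minimal, hypothetical instance of this empirical similarity map: the (deliberately non-symmetric) function `sim`, the landmark count, and the perceptron learner are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def sim(x, z):
    # Hypothetical similarity measure for illustration only: it is neither
    # positive semi-definite nor symmetric, which a kernel would have to be.
    return np.exp(-np.linalg.norm(x - z)) + 0.1 * np.tanh(x[0] - z[0])

def similarity_map(X, landmarks):
    # Map each example to its vector of similarities to the landmark points.
    return np.array([[sim(x, l) for l in landmarks] for x in X])

def train_perceptron(F, y, epochs=100, lr=0.1):
    # A plain perceptron as the linear learner in the explicit similarity space;
    # any standard linear-separator algorithm could be substituted here.
    w, b = np.zeros(F.shape[1]), 0.0
    for _ in range(epochs):
        for f, label in zip(F, y):
            if label * (f @ w + b) <= 0:  # misclassified -> update
                w += lr * label * f
                b += lr * label
    return w, b

rng = np.random.default_rng(0)
# Toy data: two Gaussian blobs labeled +1 / -1.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([1] * 50 + [-1] * 50)

# Draw random landmark examples and build the explicit feature space.
landmarks = X[rng.choice(len(X), size=20, replace=False)]
F = similarity_map(X, landmarks)

w, b = train_perceptron(F, y)
preds = np.sign(F @ w + b)
print("training accuracy:", (preds == y).mean())
```

Because the similarity values are used only as explicit features, nothing in this pipeline requires `sim` to be positive semi-definite or even symmetric, which is exactly the flexibility the similarity-based theory is meant to capture.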