A theory of learning with similarity functions
Machine Learning
Kernel functions have become an extremely popular tool in machine learning, with an attractive theory as well. This theory views a kernel as implicitly mapping data points into a possibly very high dimensional space, and describes a kernel function as being good for a given learning problem if data is separable by a large margin in that implicit space. However, while quite elegant, this theory does not directly correspond to one's intuition of a good kernel as a good similarity function. Furthermore, it may be difficult for a domain expert to use the theory to help design an appropriate kernel for the learning task at hand since the implicit mapping may not be easy to calculate. Finally, the requirement of positive semi-definiteness may rule out the most natural pairwise similarity functions for the given problem domain.

In this work we develop an alternative, more general theory of learning with similarity functions (i.e., sufficient conditions for a similarity function to allow one to learn well) that does not require reference to implicit spaces, and does not require the function to be positive semi-definite (or even symmetric). Our results also generalize the standard theory in the sense that any good kernel function under the usual definition can be shown to also be a good similarity function under our definition (though with some loss in the parameters). In this way, we provide the first steps towards a theory of kernels that describes the effectiveness of a given kernel function in terms of natural similarity-based properties.
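The reduction underlying this style of result can be sketched concretely: map each point to its vector of similarities with a small set of "landmark" examples, then learn an ordinary linear separator in that space. The toy data, the deliberately non-PSD (and asymmetric) similarity function, and the perceptron training loop below are illustrative assumptions for the sketch, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary problem: two well-separated Gaussian blobs in the plane.
X_pos = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(100, 2))
X_neg = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(100, 2))
X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(100), -np.ones(100)])

def similarity(a, b):
    # Deliberately NOT a valid kernel: asymmetric in (a, b) and not
    # positive semi-definite, yet still informative for this problem.
    return np.tanh(a @ b - 0.3 * np.sum(a))

# Draw a few landmark examples and map every point to its
# similarity profile against them (no implicit feature space needed).
landmarks = X[rng.choice(len(X), size=10, replace=False)]
F = np.array([[similarity(x, l) for l in landmarks] for x in X])

# Learn a linear separator in the similarity space with a
# plain perceptron.
w = np.zeros(F.shape[1])
for _ in range(50):
    for f, label in zip(F, y):
        if label * (f @ w) <= 0:
            w += label * f

accuracy = np.mean(np.sign(F @ w) == y)
print(f"training accuracy in similarity space: {accuracy:.2f}")
```

The point of the sketch is that learning happens entirely in the explicit, low-dimensional similarity space, so no positive semi-definiteness or implicit mapping is ever invoked; only the quality of the similarity function for the task matters.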