Learning distance function by coding similarity

Authors:
Aharon Bar Hillel; Daphna Weinshall
Affiliations:
Intel Research, Haifa, Israel; The Hebrew University of Jerusalem, Jerusalem, Israel
Venue:
Proceedings of the 24th international conference on Machine learning
Year:
2007

Abstract

We consider the problem of learning a similarity function from a set of positive equivalence constraints, i.e., 'similar' point pairs. We define the similarity in information-theoretic terms, as the gain in coding length when shifting from independent encoding of the pair to joint encoding. Under simple Gaussian assumptions, this formulation leads to a non-Mahalanobis similarity function which is efficient and simple to learn. This function can be viewed as a likelihood ratio test, and we show that the optimal similarity-preserving projection of the data is a variant of Fisher Linear Discriminant. We also show that under some naturally occurring sampling conditions of equivalence constraints, this function converges to a known Mahalanobis distance (RCA). The suggested similarity function exhibits superior performance over alternative Mahalanobis distances learnt from the same data. Its superiority is demonstrated in the context of image retrieval and graph-based clustering, using a large number of data sets.
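The coding-length gain described in the abstract can be illustrated with a minimal sketch. It assumes zero-mean Gaussians after centering, so that the code length of a point is its negative log-density and the similarity is the log-likelihood ratio between the joint model of a pair and the product of its marginals. The function names (`fit_gaussian_pair_model`, `coding_similarity`), the estimator (each pair stacked in both orders so the joint covariance is symmetric) and the `ridge` term are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def fit_gaussian_pair_model(X, pairs, ridge=1e-6):
    """Estimate a joint Gaussian over 'similar' pairs (illustrative estimator).

    X: (n, d) data matrix; pairs: (m, 2) integer indices of similar points.
    """
    X = np.asarray(X, dtype=float)
    pairs = np.asarray(pairs)
    a, b = X[pairs[:, 0]], X[pairs[:, 1]]
    mu = np.concatenate([a, b]).mean(axis=0)
    a, b = a - mu, b - mu
    # Stack every pair in both orders: the joint covariance then has equal
    # diagonal blocks and a symmetric cross-covariance block.
    Z = np.vstack([np.hstack([a, b]), np.hstack([b, a])])
    d = X.shape[1]
    # The ridge keeps the matrix invertible; it is an assumption of this sketch.
    joint = Z.T @ Z / len(Z) + ridge * np.eye(2 * d)
    marginal = joint[:d, :d]
    return mu, marginal, joint

def _code_length(z, cov):
    # Negative Gaussian log-density in nats, omitting the (k/2) log(2*pi)
    # constant, which cancels in the difference computed below.
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (z @ np.linalg.solve(cov, z) + logdet)

def coding_similarity(x, y, mu, marginal, joint):
    """Coding-length gain: independent encoding minus joint encoding."""
    x = np.asarray(x, dtype=float) - mu
    y = np.asarray(y, dtype=float) - mu
    independent = _code_length(x, marginal) + _code_length(y, marginal)
    together = _code_length(np.concatenate([x, y]), joint)
    return independent - together
```

A larger value means the pair is cheaper to encode jointly than separately, which is the sense in which the score acts as a likelihood ratio test. Because the score also contains quadratic terms in each point separately, it is not a function of the difference `x - y` alone, so it is not a Mahalanobis distance in general.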