Learning a distance metric for object identification without human supervision

  • Authors:
  • Satoshi Oyama; Katsumi Tanaka

  • Affiliations:
  • Department of Social Informatics, Graduate School of Informatics, Kyoto University, Kyoto, Japan;Department of Social Informatics, Graduate School of Informatics, Kyoto University, Kyoto, Japan

  • Venue:
  • PKDD'06: Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases
  • Year:
  • 2006

Abstract

A method is described for learning a distance metric for use in object identification that does not require human supervision. It is based on two assumptions. One is that pairs of different names refer to different objects. The other is that names are arbitrary. These two assumptions justify using pairs of data items for objects with different names as “cannot-be-linked” example pairs for learning a distance metric for use in clustering ambiguous names. The metric learning is formulated using only dissimilar example pairs as a convex quadratic programming problem that can be solved much faster than a semi-definite programming problem, which generally must be solved to learn a distance metric matrix. Experiments on author identification using a bibliographic database showed that the learned metric improves identification F-measure.
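
The abstract does not give the exact optimization problem, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' formulation. It assumes a diagonal metric (one non-negative weight per feature), treats each "cannot-be-linked" pair as a constraint that its weighted squared distance should be at least 1, and solves the resulting convex quadratic program (minimize the squared norm of the weights plus a hinge penalty) by dual coordinate ascent. The function names, the penalty parameter `C`, and the toy feature vectors are all hypothetical.

```python
def learn_diagonal_metric(dissimilar_pairs, C=10.0, epochs=200):
    """Learn per-feature weights w from dissimilar pairs only (illustrative sketch).

    Assumed problem: minimize 0.5*||w||^2 + C*sum(slack_i)
    subject to  sum_k w_k * (x_k - y_k)^2 >= 1 - slack_i,  slack_i >= 0.
    Solved in the dual (box-constrained QP) by coordinate ascent.
    """
    # z_i holds the element-wise squared differences of pair i (always >= 0).
    zs = [[(a - b) ** 2 for a, b in zip(x, y)] for x, y in dissimilar_pairs]
    w = [0.0] * len(zs[0])
    alpha = [0.0] * len(zs)  # one dual variable per pair, kept in [0, C]
    for _ in range(epochs):
        for i, z in enumerate(zs):
            q = sum(v * v for v in z)
            if q == 0.0:
                continue  # identical feature vectors: constraint cannot be met
            grad = sum(wk * zk for wk, zk in zip(w, z)) - 1.0
            new_alpha = min(max(alpha[i] - grad / q, 0.0), C)
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                # w = sum_i alpha_i * z_i, so w stays non-negative automatically.
                w = [wk + delta * zk for wk, zk in zip(w, z)]
                alpha[i] = new_alpha
    return w


def weighted_sq_distance(w, x, y):
    """Squared distance under the learned diagonal metric."""
    return sum(wk * (a - b) ** 2 for wk, a, b in zip(w, x, y))


# Toy data: each pair holds feature vectors of items with different names.
pairs = [((0.0, 0.0), (1.0, 0.0)), ((0.0, 0.0), (0.0, 2.0))]
weights = learn_diagonal_metric(pairs)
```

Because every dual variable and every squared difference is non-negative, the learned weights are non-negative without an explicit constraint, which keeps this diagonal case a plain quadratic program. A full metric matrix would additionally need a positive semi-definiteness condition, which is what typically leads to semi-definite programming.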