Good properties of similarity measures and their complementarity

  • Authors:
  • Leo Egghe

  • Affiliations:
  • Universiteit Hasselt (UHasselt), Campus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgium (permanent address)/ Universiteit Antwerpen (UA), Stadscampus, Venusstraat 35, B-2000 Antwerpen, Belgium

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Similarity measures, such as the ones of Jaccard, Dice, or Cosine, measure the similarity between two vectors. A good property for similarity measures would be that, if we add a constant vector to both vectors, then the similarity must increase. We show that Dice and Jaccard satisfy this property while Cosine and both overlap measures do not. Adding a constant vector is called, in Lorenz concentration theory, “nominal increase” and we show that the stronger “transfer principle” is not a required good property for similarity measures. Another good property is that, when we have two vectors and if we add one of these vectors to both vectors, then the similarity must increase. Now Dice, Jaccard, Cosine, and one of the overlap measures satisfy this property, while the other overlap measure does not. Also a variant of this latter property is studied. © 2010 Wiley Periodicals, Inc.