New relations between similarity measures for vectors based on vector norms

  • Authors:
  • Leo Egghe

  • Affiliations:
  • Universiteit Hasselt, Campus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgium and Universiteit Antwerpen, IBW, Stadscampus, Venusstraat 35, B-2000 Antwerpen, Belgium

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2009

Abstract

The well-known similarity measures Jaccard, Salton's cosine, Dice, and several related overlap measures for vectors are compared. Although general relations between them cannot be proved, we study these measures on the “trajectories” of the form $\|\overrightarrow{X}\| = a\|\overrightarrow{Y}\|$, where a > 0 is a constant and ||·|| denotes the Euclidean norm of a vector. In this case, direct functional relations between these measures are proved. For Jaccard, we prove that it is a convexly increasing function of Salton's cosine measure, but always smaller than or equal to the latter, thereby explaining a curve found experimentally by Leydesdorff. All the other measures have a linear relation with Salton's cosine, which reduces to equality in the case a = 1. Hence, for equally normed vectors (e.g., for normalized vectors) we essentially have only Jaccard's measure and Salton's cosine measure, since all the other measures are equal to the latter. © 2009 Wiley Periodicals, Inc.
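
The relations described in the abstract can be checked numerically. The sketch below is illustrative only. The abstract does not define the measures, so it assumes the usual vector-norm forms: cosine = X·Y / (||X|| ||Y||), Jaccard = X·Y / (||X||² + ||Y||² − X·Y), and Dice = 2 X·Y / (||X||² + ||Y||²). Under those assumed definitions, substituting ||X|| = a||Y|| gives Jaccard = a·Cos / (a² + 1 − a·Cos) and Dice = 2a·Cos / (a² + 1). The function names and the example vectors are hypothetical and are not taken from the paper.

```python
import math


def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm."""
    return math.sqrt(dot(x, x))


# Assumed vector-norm definitions of the measures (not stated in the abstract).
def cosine(x, y):
    return dot(x, y) / (norm(x) * norm(y))


def jaccard(x, y):
    return dot(x, y) / (dot(x, x) + dot(y, y) - dot(x, y))


def dice(x, y):
    return 2 * dot(x, y) / (dot(x, x) + dot(y, y))


# Functional relations on the trajectory ||X|| = a * ||Y||,
# derived from the assumed definitions above.
def jaccard_from_cosine(cos, a):
    return a * cos / (a * a + 1 - a * cos)


def dice_from_cosine(cos, a):
    return 2 * a * cos / (a * a + 1)


# Hypothetical example vectors with non-negative entries.
y = [1.0, 2.0, 2.0]  # Euclidean norm 3
x = [4.0, 4.0, 2.0]  # Euclidean norm 6, so a = 2
a = norm(x) / norm(y)
cos = cosine(x, y)

print("a       =", a)
print("cosine  =", cos)
print("Jaccard =", jaccard(x, y), "vs. relation:", jaccard_from_cosine(cos, a))
print("Dice    =", dice(x, y), "vs. relation:", dice_from_cosine(cos, a))
```

With a = 1 the Dice relation reduces to Dice = Cos, while the Jaccard relation becomes Cos / (2 − Cos), which is convex and increasing in Cos and never exceeds it for non-negative cosine values. This is consistent with the behaviour the abstract describes.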