Bounds on lengths of real valued vectors similar with regard to the Tanimoto similarity

  • Authors: Marzena Kryszkiewicz
  • Affiliations: Institute of Computer Science, Warsaw University of Technology, Warsaw, Poland
  • Venue: ACIIDS'13 Proceedings of the 5th Asian conference on Intelligent Information and Database Systems - Volume Part I
  • Year: 2013

Abstract

The Tanimoto similarity measure finds numerous applications in chemistry, bioinformatics, information retrieval, and text mining. A typical task in these applications is finding the vectors most similar to a given one, which is very time-consuming for very large data sets. Methods that efficiently restrict the number of vectors that have a chance of being sufficiently similar to a given vector are therefore of high importance. To this end, we have recently derived bounds on the lengths of vectors that are similar with respect to the Tanimoto similarity. In this paper, we recall those results and derive new bounds on the lengths of real-valued vectors that have a chance of being Tanimoto-similar to a given vector to a required degree. Finally, we compare the previous and current results and illustrate their usefulness.
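
The abstract does not state the bounds themselves, so the following is only a minimal sketch of how length-based pruning of this kind could look. It assumes the usual real-valued definition T(u, v) = u·v / (|u|² + |v|² − u·v) and uses a length interval that follows from the Cauchy-Schwarz inequality: if T(u, v) ≥ ε, then |v| lies in [|u|/α, α|u|] with α = ((1 + 1/ε) + √((1 + 1/ε)² − 4)) / 2. Whether this interval coincides with the previous or the new bounds of the paper is an assumption; the function names `tanimoto`, `length_bounds`, and `candidates` are hypothetical and chosen for illustration.

```python
import math


def norm(v):
    """Euclidean length of a real-valued vector."""
    return math.sqrt(sum(x * x for x in v))


def tanimoto(u, v):
    """Tanimoto similarity u.v / (|u|^2 + |v|^2 - u.v) for real-valued vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    denom = sum(a * a for a in u) + sum(b * b for b in v) - dot
    # The denominator is zero only when both vectors are zero;
    # returning 1.0 in that case is a convention assumed here.
    return dot / denom if denom else 1.0


def length_bounds(u, eps):
    """Interval of lengths |v| compatible with tanimoto(u, v) >= eps.

    Derived from Cauchy-Schwarz (u.v <= |u||v|); an illustrative bound,
    not necessarily the one stated in the paper.
    """
    if not 0 < eps <= 1:
        raise ValueError("eps must be in (0, 1]")
    s = 1 + 1 / eps
    alpha = (s + math.sqrt(max(0.0, s * s - 4))) / 2
    n = norm(u)
    return n / alpha, n * alpha


def candidates(u, vectors, eps):
    """Keep only vectors whose length does not rule out similarity >= eps."""
    lo, hi = length_bounds(u, eps)
    return [v for v in vectors if lo <= norm(v) <= hi]


if __name__ == "__main__":
    u = (3.0, 4.0)
    data = [(3.0, 4.0), (30.0, 40.0), (0.3, 0.4), (4.0, 3.0)]
    for v in candidates(u, data, eps=0.8):
        print(v, tanimoto(u, v))
```

The length test is only a necessary condition: vectors outside the interval cannot reach the threshold, while those inside it still need the exact similarity computed. If the data set is sorted by vector length, the interval can be located by binary search rather than a full scan.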