Approximate similarity retrieval with M-trees

  • Authors:
  • Pavel Zezula; Pasquale Savino; Giuseppe Amato; Fausto Rabitti

  • Affiliations:
  • CNUCE-CNR, Via S. Maria, 36, 56126 Pisa, Italy, E-mail: zezula@iei.pi.cnr.it, F.Rabitti@cnuce.cnr.it; IEI-CNR, Via S. Maria, 46, 56126 Pisa, Italy, E-mail: {P.Savino, G.Amato}@iei.pi.cnr.it

  • Venue:
  • The VLDB Journal — The International Journal on Very Large Data Bases
  • Year:
  • 1998

Abstract

Motivated by the urgent need to improve the efficiency of similarity queries, approximate similarity retrieval is investigated in the environment of a metric tree index called the M-tree. Three different approximation techniques are proposed, which show how to forsake query precision for improved performance. Measures are defined that can quantify the improvements in performance efficiency and the quality of approximations. The proposed approximation techniques are then tested on various synthetic and real-life files. The evidence obtained from the experiments confirms our hypothesis that a high-quality approximated similarity search can be performed at a much lower cost than that needed to obtain the exact results. The proposed approximation techniques are scalable and appear to be independent of the metric used. Extensions of these techniques to the environments of other similarity search indexes are also discussed.