Stabilizing the recall in similarity search

  • Authors:
  • Martin Kyselak;David Novak;Pavel Zezula

  • Affiliations:
  • Masaryk University Brno, Czech Republic;Masaryk University Brno, Czech Republic;Masaryk University Brno, Czech Republic

  • Venue:
  • Proceedings of the Fourth International Conference on SImilarity Search and APplications
  • Year:
  • 2011

Abstract

Recent techniques for approximate similarity search focus on optimizing answer precision and recall, and they typically improve the average of these measures over a set of sample queries. However, we have observed that the recall for particular indexes and queries can fluctuate considerably. To stabilize the recall, we propose a query-evaluation model that exploits several variants of the search index. This approach is applicable to a significant subset of current approximate methods, with a focus on techniques based purely on metric postulates. Applying the approach to the M-Index structure, we perform extensive measurements on large datasets and show that it has a positive impact on recall stability and suppresses the most unsatisfactory cases. Further, the results indicate that the proposed approach can also increase the average recall for given overall search costs.
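
The idea of evaluating one query against several variants of an index can be illustrated with a minimal sketch. It is not the M-Index and not the authors' implementation: the `PivotIndex` class, the Euclidean `dist` function, the random pivot selection, and all parameter values are assumptions chosen only to keep the example small. Each variant partitions the same data by a different pivot set, the candidate sets of all variants are merged, and the merged set is re-ranked by the true distance.

```python
import random


def dist(a, b):
    """Euclidean distance, standing in for an arbitrary metric (assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class PivotIndex:
    """Hypothetical toy index: each object lives in the cell of its nearest pivot."""

    def __init__(self, data, num_pivots, seed):
        rng = random.Random(seed)  # a different seed yields a different index variant
        self.pivots = rng.sample(data, num_pivots)
        self.cells = {i: [] for i in range(num_pivots)}
        for obj_id, obj in enumerate(data):
            self.cells[self._nearest_pivots(obj, 1)[0]].append(obj_id)

    def _nearest_pivots(self, obj, n):
        # Pivot positions ordered by increasing distance from obj.
        order = sorted(range(len(self.pivots)),
                       key=lambda i: dist(obj, self.pivots[i]))
        return order[:n]

    def candidates(self, query, cells_to_visit):
        """Approximate step: collect object ids from the cells closest to the query."""
        ids = set()
        for cell in self._nearest_pivots(query, cells_to_visit):
            ids.update(self.cells[cell])
        return ids


def knn_from_candidates(data, query, candidate_ids, k):
    """Refinement step: rank candidates by true distance and keep the best k."""
    return sorted(candidate_ids, key=lambda i: dist(query, data[i]))[:k]


def multi_variant_knn(variants, data, query, k, cells_per_variant):
    """Merge the candidate sets of all index variants, then refine once."""
    merged = set()
    for index in variants:
        merged |= index.candidates(query, cells_per_variant)
    return knn_from_candidates(data, query, merged, k)


def recall(approx_ids, exact_ids):
    """Fraction of the exact answer that the approximate answer contains."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Small demonstration on synthetic 2-D points (made-up data, not from the paper).
rng = random.Random(0)
data = [(rng.random(), rng.random()) for _ in range(500)]
variants = [PivotIndex(data, num_pivots=16, seed=s) for s in range(3)]
query = (0.5, 0.5)

exact = knn_from_candidates(data, query, range(len(data)), 10)
single = knn_from_candidates(data, query, variants[0].candidates(query, 1), 10)
combined = multi_variant_knn(variants, data, query, 10, cells_per_variant=1)

print("single variant recall:", recall(single, exact))
print("three variants recall:", recall(combined, exact))
```

In this sketch the merged candidate set always contains the candidates of any single variant, so the combined recall cannot fall below the recall of that variant. A fair comparison at equal cost would instead set three variants visiting one cell each against one variant visiting three cells, which roughly mirrors the "given overall search costs" condition in the abstract.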