Impact of the initialization in tree-based fast similarity search techniques

  • Authors:
  • Aureo Serrano;Luisa Micó;Jose Oncina

  • Affiliations:
  • Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Alicante, Spain;Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Alicante, Spain;Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Alicante, Spain

  • Venue:
  • SIMBAD'11 Proceedings of the First international conference on Similarity-based pattern recognition
  • Year:
  • 2011

Abstract

Many fast similarity search techniques rely on pivots (specially selected points in the data set). Using these points, specific structures (indexes) are built that speed up the search at query time. Pivot selection techniques are usually incremental, with the first pivot chosen at random. This article explores several techniques for choosing the first pivot in a tree-based fast similarity search technique. We provide experimental results showing that an adequate choice of this pivot leads to significant reductions in distance computations and time complexity. Moreover, most pivot tree-based indexes emphasize building balanced trees. We provide experimental and theoretical support that very unbalanced trees can be a better choice than balanced ones.
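
To make the role of pivots concrete, the following is a minimal sketch, not the method of the paper. It assumes a flat pivot table rather than a tree-based index, Euclidean distance, and farthest-first incremental selection; all function names (`random_first_pivot`, `select_pivots`, `build_table`, `nn_search`) are hypothetical. The `first` argument is where a different initialization strategy could be plugged in, and the returned counter shows how many distance computations a query needed.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def random_first_pivot(points, seed=None):
    """Baseline initialization: pick the first pivot index at random."""
    return random.Random(seed).randrange(len(points))


def select_pivots(points, k, first, dist=euclidean):
    """Incremental (farthest-first) selection of k pivot indices.

    Assumption: each new pivot is the point farthest from the pivots
    chosen so far. `first` is the index of the initial pivot.
    """
    pivots = [first]
    # min_dist[i] = distance from point i to its closest pivot so far
    min_dist = [dist(p, points[first]) for p in points]
    while len(pivots) < k:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        pivots.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], dist(p, points[nxt]))
    return pivots


def build_table(points, pivots, dist=euclidean):
    """Precompute table[i][j] = distance from point i to pivot j."""
    return [[dist(p, points[pv]) for pv in pivots] for p in points]


def nn_search(points, pivots, table, query, dist=euclidean):
    """Nearest-neighbour search with triangle-inequality pruning.

    Returns (index of the nearest point, number of distance computations).
    """
    q_to_pivot = [dist(query, points[pv]) for pv in pivots]
    computed = len(pivots)
    best, best_d = None, float("inf")
    for j, pv in enumerate(pivots):
        if q_to_pivot[j] < best_d:
            best, best_d = pv, q_to_pivot[j]
    pivot_set = set(pivots)
    for i, x in enumerate(points):
        if i in pivot_set:
            continue
        # |d(q, p) - d(x, p)| <= d(q, x) for every pivot p
        lower = max(abs(q_to_pivot[j] - table[i][j])
                    for j in range(len(pivots)))
        if lower >= best_d:
            continue  # x cannot beat the current best: skip the distance
        d = dist(query, x)
        computed += 1
        if d < best_d:
            best, best_d = i, d
    return best, computed
```

Running `nn_search` with pivot sets built from different `first` indices and comparing the returned counts is one simple way to observe how the initialization affects the number of distance computations on a given data set.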