The list of clusters revisited

  • Authors: Eric Sadit Tellez; Edgar Chávez

  • Affiliations: Universidad Michoacana de San Nicolás de Hidalgo, México (both authors)

  • Venue: MCPR'12 Proceedings of the 4th Mexican conference on Pattern Recognition
  • Year: 2012

Abstract

One of the most efficient indexes for similarity search is the so-called list of clusters; to fix ideas, think of speeding up k-nn searches in a very large database. This data structure is a counterintuitive construction that can be seen as extremely unbalanced, in contrast to the balanced data structures used for exact searching. In practical terms, there is no better alternative for exact indexing, where every search returns all the relevant results, as opposed to approximate similarity search. The major drawback of the list of clusters is its quadratic construction time. In this paper we revisit the list of clusters, aiming to speed up construction without sacrificing search efficiency. We obtain similar search times while saving a significant amount of time in the construction phase.
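
To make the structure concrete, here is a minimal sketch of the classical list of clusters as it is commonly described in the literature (fixed bucket size variant), not the revised construction proposed in this paper. The function names, the choice of the first remaining point as each center, and the Euclidean metric are assumptions made for illustration. Each center is compared against every remaining point, which is where the quadratic construction cost mentioned in the abstract comes from.

```python
import math


def euclidean(a, b):
    # Assumption: points are coordinate tuples; any metric could be used instead.
    return math.dist(a, b)


def build_list_of_clusters(points, bucket_size, dist=euclidean):
    """Return a list of (center, covering_radius, bucket) triples."""
    remaining = list(points)
    clusters = []
    while remaining:
        # Assumption: the first remaining point becomes the next center.
        center = remaining.pop(0)
        # Every remaining point is compared against the center, so each
        # of the roughly n / bucket_size centers costs O(n) distances.
        remaining.sort(key=lambda p: dist(center, p))
        bucket = remaining[:bucket_size]
        remaining = remaining[bucket_size:]
        radius = dist(center, bucket[-1]) if bucket else 0.0
        clusters.append((center, radius, bucket))
    return clusters


def range_search(clusters, query, r, dist=euclidean):
    """Return every indexed point within distance r of query (exact search)."""
    results = []
    for center, radius, bucket in clusters:
        d = dist(query, center)
        if d <= r:
            results.append(center)
        if d - r <= radius:
            # The query ball intersects this cluster, so scan its bucket.
            results.extend(p for p in bucket if dist(query, p) <= r)
        if d + r < radius:
            # The query ball lies strictly inside this cluster; later
            # clusters only hold points outside it, so the scan can stop.
            break
    return results
```

The early stop in `range_search` relies on the clusters being visited in construction order: points stored in later clusters are at least `radius` away from the current center, which is also why the structure behaves like a very unbalanced list rather than a balanced tree.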