A Brief Index for Proximity Searching

Authors:
Eric Sadit Téllez; Edgar Chávez; Antonio Camarena-Ibarrola
Affiliations:
Universidad Michoacana; Universidad Michoacana and CICESE; Universidad Michoacana
Venue:
CIARP '09 Proceedings of the 14th Iberoamerican Conference on Pattern Recognition: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications
Year:
2009

Abstract

Many pattern recognition tasks can be modeled as proximity searching, where the common task is to quickly find all the elements close to a given query without sequentially scanning a very large database. A recent shift in the searching paradigm uses permutations instead of distances to predict proximity. Every object in the database records how it sees the set of reference objects (the permutants), i.e., only their relative positions, ordered by distance to the object, are stored. When a query arrives, the relative displacements of the permutants between the query and each candidate object are measured. This approach has turned out to be the most efficient and scalable, at the expense of losing recall in the answers. In practice, the permutation of every object is represented with k short integers, where k is the number of permutants, producing bulky indexes of 16kn bits for a database of n objects. In this paper we show how to represent the permutation as a binary vector, using just one bit per permutant (instead of log k in the plain representation). The Hamming distance between binary signatures is then used to predict proximity between objects in the database. We tested this approach on many real-life metric databases, obtaining faster queries with a recall close to that of the Spearman ρ, using 16 times less space.
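
The ideas in the abstract can be illustrated with a minimal Python sketch. It shows how an object's permutation of the permutants might be computed, how two permutations could be compared with the Spearman ρ (here the sum of squared rank differences, without the square root), and how a one-bit-per-permutant signature could be compared with the Hamming distance. The abstract does not say how each bit is derived, so the thresholding rule in brief_signature is purely an assumption for illustration and may differ from the encoding used in the paper. All function names, the threshold parameter, and the symbols k (number of permutants) and n (database size) are likewise names chosen for this sketch.

```python
import math


def permutation(obj, permutants, dist):
    """Indices of the permutants sorted by increasing distance to obj."""
    return sorted(range(len(permutants)),
                  key=lambda i: (dist(obj, permutants[i]), i))


def inverse(perm):
    """inv[i] is the rank (position) of permutant i inside perm."""
    inv = [0] * len(perm)
    for pos, i in enumerate(perm):
        inv[i] = pos
    return inv


def spearman_rho(perm_a, perm_b):
    """Sum of squared differences between the ranks of each permutant."""
    return sum((x - y) ** 2
               for x, y in zip(inverse(perm_a), inverse(perm_b)))


def brief_signature(perm, threshold):
    """ASSUMED encoding (not taken from the abstract): bit i is set when
    permutant i sits more than `threshold` positions away from slot i."""
    sig = 0
    for i, pos in enumerate(inverse(perm)):
        if abs(pos - i) > threshold:
            sig |= 1 << i
    return sig


def hamming(sig_a, sig_b):
    """Number of bit positions in which two signatures differ."""
    return bin(sig_a ^ sig_b).count("1")


def index_bits(k, n, bits_per_permutant):
    """Total index size in bits for n objects and k permutants."""
    return bits_per_permutant * k * n


# Toy usage: four permutants in the plane under Euclidean distance.
permutants = [(0, 0), (10, 0), (0, 10), (10, 10)]
perm_u = permutation((1, 2), permutants, math.dist)
perm_v = permutation((9, 8), permutants, math.dist)
rho = spearman_rho(perm_u, perm_v)
ham = hamming(brief_signature(perm_u, 1), brief_signature(perm_v, 1))

# Space ratio between 16-bit integers and one bit per permutant.
ratio = index_bits(64, 1000, 16) // index_bits(64, 1000, 1)
```

With 16-bit short integers the plain index needs 16kn bits, while one bit per permutant needs kn bits, which is where the factor of 16 in the abstract comes from regardless of the particular k and n chosen above.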