Metric Index: An Efficient and Scalable Solution for Similarity Search

Authors:
David Novak; Michal Batko
Affiliations:
-;-
Venue:
SISAP '09 Proceedings of the 2009 Second International Workshop on Similarity Search and Applications
Year:
2009

Abstract

The metric space, as a universal and versatile model of similarity, can be applied in various areas of non-text information retrieval. However, a general, efficient, and scalable solution for metric data management remains an open research challenge. We introduce a novel indexing and searching mechanism called the Metric Index (M-Index), which employs practically all known principles of metric space partitioning, pruning, and filtering. The heart of the M-Index is a general mapping mechanism that makes it possible to store the data in well-established structures such as the B+-tree, or even in distributed storage. We have implemented the M-Index on top of a B+-tree and performed experiments on a combination of five MPEG-7 descriptors in a database of hundreds of thousands of digital images. The experiments test several M-Index variants and compare them with two orthogonal approaches: the PM-Tree and iDistance. The trials show that the M-Index outperforms the others in efficiency of search-space pruning, I/O costs, and response times for precise similarity queries. Furthermore, the M-Index demonstrates an excellent ability to keep similar data close together in the index, which makes its approximation algorithm very efficient: response times stay practically constant and recall remains very high as the dataset grows.
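The abstract does not give the mapping formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single-level, iDistance-style key: each object is assigned to its nearest pivot, and its key is the pivot's index plus the normalized distance to that pivot. A sorted list stands in for the B+-tree. The class name `PivotKeyIndex`, the `max_distance` bound, and the `EPS` tolerance are all hypothetical and chosen for illustration.

```python
import bisect
import math


def euclidean(a, b):
    """Plain Euclidean distance; any metric could be substituted."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class PivotKeyIndex:
    """Hypothetical pivot-based mapping of metric objects to scalar keys."""

    EPS = 1e-9  # assumed tolerance that only widens the scanned intervals

    def __init__(self, pivots, max_distance, distance=euclidean):
        # Assumption: max_distance is a comfortable upper bound on any
        # distance, so a normalized distance stays well below 1.
        self.pivots = list(pivots)
        self.max_distance = max_distance
        self.distance = distance
        self.keys = []     # sorted scalar keys (stand-in for a B+-tree)
        self.objects = []  # objects stored in the same order as keys

    def key(self, obj):
        """Key = index of the nearest pivot + normalized distance to it."""
        dists = [self.distance(p, obj) for p in self.pivots]
        i = min(range(len(dists)), key=dists.__getitem__)
        return i + dists[i] / self.max_distance

    def insert(self, obj):
        k = self.key(obj)
        pos = bisect.bisect_right(self.keys, k)
        self.keys.insert(pos, k)
        self.objects.insert(pos, obj)

    def range_query(self, q, r):
        """Return all stored objects within distance r of q."""
        dq = [self.distance(p, q) for p in self.pivots]
        nearest = min(dq)
        result = []
        for i, d in enumerate(dq):
            # Partition pruning: if pivot i is more than 2r farther from q
            # than the nearest pivot, no object in cell i can qualify.
            if d - nearest > 2 * r + self.EPS:
                continue
            # Triangle inequality: qualifying objects in cell i have a
            # pivot distance in [d - r, d + r], which is a key interval.
            lo = max(float(i), i + (d - r) / self.max_distance - self.EPS)
            hi = i + (d + r) / self.max_distance + self.EPS
            start = bisect.bisect_left(self.keys, lo)
            end = min(bisect.bisect_right(self.keys, hi),
                      bisect.bisect_left(self.keys, i + 1))
            # Filtering: verify surviving candidates with the real metric.
            for obj in self.objects[start:end]:
                if self.distance(q, obj) <= r:
                    result.append(obj)
        return result
```

Because every candidate is verified with the real distance function, the interval scan can only over-select, never miss a result, as long as the assumed `max_distance` bound holds. A multi-level or distributed variant would need a richer key than this sketch provides.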