Approximate similarity retrieval with M-trees

Authors:
Pavel Zezula;Pasquale Savino;Giuseppe Amato;Fausto Rabitti
Affiliations:
CNUCE-CNR, Via S. Maria, 36, 56126 Pisa, Italy, E-mail: zezula@iei.pi.cnr.it, F.Rabitti@cnuce.cnr.it;IEI-CNR, Via S. Maria, 46, 56126 Pisa, Italy, E-mail: {P.Savino, G.Amato}@iei.pi.cnr.it
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
1998

Citing 16
Cited 28

The R*-tree: an efficient and robust access method for points and rectangles
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data

Geometric range searching
ACM Computing Surveys (CSUR)

Distance-based indexing for high-dimensional metric spaces
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data

A cost model for similarity queries in metric spaces
PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems

An optimal algorithm for approximate nearest neighbor searching in fixed dimensions
Journal of the ACM (JACM)

Approximate String Matching
ACM Computing Surveys (CSUR)

R-trees: a dynamic index structure for spatial searching
SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data

Comparing Images Using the Hausdorff Distance
IEEE Transactions on Pattern Analysis and Machine Intelligence

Processing Complex Similarity Queries with Distance-Based Access Methods
EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology

Similarity Indexing with the SS-tree
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering

M-tree: An Efficient Access Method for Similarity Search in Metric Spaces
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases

Content-Based Image Indexing
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Near Neighbor Search in Large Metric Spaces
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases

The X-tree: An Index Structure for High-Dimensional Data
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases

Efficient User-Adaptable Similarity Search in Large Multimedia Databases
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases

Processing M-trees with Parallel Resources
RIDE '98 Proceedings of the Workshop on Research Issues in Database Engineering

Clustering for Approximate Similarity Search in High-Dimensional Spaces
IEEE Transactions on Knowledge and Data Engineering

VQ-index: an index structure for similarity searching in multimedia databases
Proceedings of the tenth ACM international conference on Multimedia

Region proximity in metric spaces and its use for approximate similarity search
ACM Transactions on Information Systems (TOIS)

Navigating massive data sets via local clustering
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining

Probabilistic proximity searching algorithms based on compact partitions
Journal of Discrete Algorithms - SPIRE 2002

Fast Approximate Similarity Search in Extremely High-Dimensional Data Sets
ICDE '05 Proceedings of the 21st International Conference on Data Engineering

Query-sensitive embeddings
Proceedings of the 2005 ACM SIGMOD international conference on Management of data

Accelerating approximate similarity queries using genetic algorithms
Proceedings of the 2005 ACM symposium on Applied computing

High dimensional nearest neighbor searching
Information Systems

Query-sensitive embeddings
ACM Transactions on Database Systems (TODS)

Genetic algorithms for approximate similarity queries
Data & Knowledge Engineering

Unified framework for fast exact and approximate search in dissimilarity spaces
ACM Transactions on Database Systems (TODS)

BoostMap: An Embedding Method for Efficient Nearest Neighbor Retrieval
IEEE Transactions on Pattern Analysis and Machine Intelligence

A posteriori multi-probe locality sensitive hashing
MM '08 Proceedings of the 16th ACM international conference on Multimedia

Approximate similarity search in metric spaces using inverted files
Proceedings of the 3rd international conference on Scalable information systems

Approximate similarity search: A multi-faceted problem
Journal of Discrete Algorithms

Metric Index: An Efficient and Scalable Solution for Similarity Search
SISAP '09 Proceedings of the 2009 Second International Workshop on Similarity Search and Applications

Building a web-scale image similarity search system
Multimedia Tools and Applications

An approach to content-based image retrieval based on the Lucene search engine library
ECDL'10 Proceedings of the 14th European conference on Research and advanced technology for digital libraries

Metric Index: An efficient and scalable solution for precise and approximate similarity search
Information Systems

Stabilizing the recall in similarity search
Proceedings of the Fourth International Conference on SImilarity Search and APplications

Approximate similarity search using samples
Proceedings of the Fourth International Conference on SImilarity Search and APplications

Accelerating video identification by skipping queries with a compact metric cache
ICCSA'10 Proceedings of the 2010 international conference on Computational Science and Its Applications - Volume Part IV

Modified LSI model for efficient search by metric access methods
ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research

SC-tree: an efficient structure for high-dimensional data indexing
BNCOD'06 Proceedings of the 23rd British National Conference on Databases, conference on Flexible and Efficient Information Handling

Distributed KNN-graph approximation via hashing
Proceedings of the 2nd ACM International Conference on Multimedia Retrieval

Large-scale similarity data management with distributed Metric Index
Information Processing and Management: an International Journal

Multimedia search and retrieval using multimodal annotation propagation and indexing techniques
Image Communication

Abstract

Motivated by the urgent need to improve the efficiency of similarity queries, approximate similarity retrieval is investigated in the environment of a metric tree index called the M-tree. Three different approximation techniques are proposed, which show how to trade query precision for improved performance. Measures are defined that quantify both the improvement in efficiency and the quality of the approximations. The proposed approximation techniques are then tested on various synthetic and real-life files. The evidence obtained from the experiments confirms our hypothesis that a high-quality approximate similarity search can be performed at a much lower cost than that needed to obtain the exact results. The proposed approximation techniques are scalable and appear to be independent of the metric used. Extensions of these techniques to the environments of other similarity search indexes are also discussed.