Priority Vantage Points Structures for Similarity Queries in Metric Spaces

Authors:
Cengiz Celik
Venue:
EurAsia-ICT '02 Proceedings of the First EurAsian Conference on Information and Communication Technology
Year:
2002

References:
An algorithm for finding nearest neighbours in (approximately) constant average time. Pattern Recognition Letters
Distance-based indexing for high-dimensional metric spaces. SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
A Simple Algorithm for Nearest Neighbor Search in High Dimensions. IEEE Transactions on Pattern Analysis and Machine Intelligence
Some approaches to best-match file searching. Communications of the ACM
Searching in metric spaces. ACM Computing Surveys (CSUR)
Fixed Queries Array: A Fast and Economical Data Structure for Proximity Searching. Multimedia Tools and Applications
Slim-Trees: High Performance Metric Trees Minimizing Overlap Between Nodes. EDBT '00 Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology
M-tree: An Efficient Access Method for Similarity Search in Metric Spaces. VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Near Neighbor Search in Large Metric Spaces. VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Spaghettis: An Array Based Algorithm for Similarity Queries in Metric Spaces. SPIRE '99 Proceedings of the String Processing and Information Retrieval Symposium & International Workshop on Groupware

Abstract

Similarity search structures for metric spaces have different performance characteristics depending on the properties of the data, construction cost, and space consumption. Nonetheless, recent experiments seem to favor vantage points-based methods such as LAESA, Spaghettis, and FQA, provided they are allowed to use enough pivots. By spending more pre-processing time, these methods can achieve superior query performance in terms of distance computations. Unfortunately, this also causes them to use more space and CPU time than other structures. In this paper, we explore ways to organize the basic structure according to the distance relations between database objects, pivots, and query objects. We introduce the priority vantage points method, which reduces the CPU overhead without adding extra space requirements. We also introduce the Kvp structure as an improvement that stores fewer distance values than other vantage points algorithms. Kvp needs one sequential scan over the index data, which makes it well suited to storage on disk. We show that Kvp is superior to the other methods given the same amount of storage.
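
To make the trade-off described in the abstract concrete, the following is a minimal sketch of generic pivot-based filtering for range queries, the idea that vantage points methods such as LAESA build on. It is not the priority vantage points method or the Kvp structure from the paper. The function names (`build_pivot_table`, `range_query`), the Euclidean metric, and the way pivots are supplied are all assumptions made for illustration. The sketch precomputes the distance from every database object to every pivot, then uses the triangle inequality bound |d(q, p) - d(o, p)| <= d(q, o) to discard objects without computing d(q, o).

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Metric = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance; any metric obeying the triangle inequality works."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def build_pivot_table(
    db: Sequence[Point], pivots: Sequence[Point], dist: Metric
) -> List[List[float]]:
    """Pre-processing: store d(object, pivot) for every object and pivot.

    This costs len(db) * len(pivots) distance computations and the same
    number of stored values, which is the space overhead the abstract
    refers to.
    """
    return [[dist(obj, p) for p in pivots] for obj in db]


def range_query(
    db: Sequence[Point],
    pivots: Sequence[Point],
    table: Sequence[Sequence[float]],
    query: Point,
    radius: float,
    dist: Metric,
) -> Tuple[List[int], int]:
    """Return (indices of objects within radius, distance computations used)."""
    query_to_pivot = [dist(query, p) for p in pivots]
    computations = len(pivots)
    hits: List[int] = []
    for i, obj in enumerate(db):
        # Triangle inequality: d(q, o) >= |d(q, p) - d(o, p)| for every pivot p.
        lower_bound = max(
            (abs(qp - op) for qp, op in zip(query_to_pivot, table[i])),
            default=0.0,
        )
        if lower_bound > radius:
            continue  # pruned without computing d(q, o)
        computations += 1
        if dist(query, obj) <= radius:
            hits.append(i)
    return hits, computations
```

Note that the inner loop still scans every row of the pivot table for each query. That scan is the CPU overhead the abstract mentions, and it grows with the number of pivots even when few real distance computations are performed.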