An algorithm for finding nearest neighbours in (approximately) constant average time
Pattern Recognition Letters
Distance-based indexing for high-dimensional metric spaces
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
A Simple Algorithm for Nearest Neighbor Search in High Dimensions
IEEE Transactions on Pattern Analysis and Machine Intelligence
Some approaches to best-match file searching
Communications of the ACM
ACM Computing Surveys (CSUR)
Fixed Queries Array: A Fast and Economical Data Structure for Proximity Searching
Multimedia Tools and Applications
Slim-Trees: High Performance Metric Trees Minimizing Overlap Between Nodes
EDBT '00 Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology
M-tree: An Efficient Access Method for Similarity Search in Metric Spaces
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Near Neighbor Search in Large Metric Spaces
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Spaghettis: An Array Based Algorithm for Similarity Queries in Metric Spaces
SPIRE '99 Proceedings of the String Processing and Information Retrieval Symposium & International Workshop on Groupware
Similarity search structures for metric spaces have different performance characteristics depending on the properties of the data, construction cost, and space consumption. Nonetheless, recent experiments seem to favor vantage point-based methods such as LAESA, Spaghettis, and FQA, provided they are allowed to use enough pivots. By spending more pre-processing time, these methods can achieve superior query performance in terms of distance computations. Unfortunately, this also causes them to use more space and CPU time than other structures. In this paper we explore ways to organize the basic structure according to distance relations between database objects, pivots, and query objects. We introduce the priority vantage points method, which reduces the CPU overhead without adding extra space requirements. The Kvp structure is also introduced as an improvement; it stores fewer distance values than other vantage point algorithms. Kvp needs only one sequential scan over the index data, making it very suitable for storage on disk. We show that Kvp is superior to the other methods given the same amount of storage.
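The pivot-based filtering that vantage point methods share can be illustrated with a minimal sketch. This is not the priority vantage points method or the Kvp structure from the paper; it is a generic pivot table with triangle-inequality pruning, and the function names (`build_pivot_table`, `range_search`), the Euclidean metric, and the tuple representation of objects are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Example metric (an assumption); any function satisfying the triangle inequality works."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_pivot_table(objects, pivots, dist=euclidean):
    """Pre-processing: store the distance from every object to every pivot.

    This costs len(objects) * len(pivots) distance computations and the
    same number of stored values, which is the space cost the abstract
    refers to.
    """
    return [[dist(obj, p) for p in pivots] for obj in objects]


def range_search(query, radius, objects, pivots, table, dist=euclidean):
    """Return (matches, number of distance computations made at query time).

    By the triangle inequality, |d(q, p) - d(o, p)| <= d(q, o) for any
    pivot p, so an object can be discarded without computing d(q, o)
    whenever that lower bound exceeds the radius for some pivot.
    """
    query_to_pivots = [dist(query, p) for p in pivots]
    computed = len(pivots)
    matches = []
    for obj, row in zip(objects, table):
        # Cheap filter: uses only stored distances, no metric evaluation.
        if any(abs(qp - op) > radius for qp, op in zip(query_to_pivots, row)):
            continue
        # The object survived every pivot, so verify it with a real distance.
        computed += 1
        if dist(query, obj) <= radius:
            matches.append(obj)
    return matches, computed
```

The filter loop touches every stored distance for the objects it scans, which is the kind of CPU overhead the abstract mentions; the sketch makes no attempt to reorder pivots or to store fewer distances per object.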