This paper evaluates standard database indexes and query processing as a means of metric indexing in the LAESA approach. Using B-trees and R-trees as pivot-based indexes lets us apply well-known optimization techniques from the database field to metric indexing and search. The novelty of the paper is a cost-based approach that dynamically decides, for each query, which pivots to use and how many. We evaluate the performance of this approach through a series of measurements on our database prototype. Compared with using all available pivots for filtering, the optimized approach halves response times for main-memory data, but gives much more varied results for disk-resident data. However, the cost model also lets us determine dynamically when to bypass the indexes and simply perform a sequential scan of the base data. We conclude that it is beneficial to create many pivots but to use only the most selective ones when evaluating each query. R-trees perform better than B-trees when all pivots are used, but when the best pivots can be selected dynamically, B-trees often perform better.
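To make the idea of per-query pivot selection concrete, the following is a minimal in-memory sketch, not the paper's prototype or its cost model. It assumes a Euclidean distance and ranks pivots by counting how many objects each one fails to exclude, which stands in for the histogram-style statistics a database optimizer would consult. All names (`build_pivot_table`, `estimate_selectivity`, `range_query`, `max_pivots`, `scan_threshold`) are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Example metric; any function obeying the triangle inequality would do."""
    return math.dist(a, b)


def build_pivot_table(data, pivots, dist=euclidean):
    """Precompute d(object, pivot) for every object and pivot (LAESA-style table)."""
    return [[dist(x, p) for p in pivots] for x in data]


def estimate_selectivity(table, j, dq, radius):
    """Fraction of objects pivot j cannot exclude for this query.

    Assumption: counted exactly here for simplicity; a real optimizer
    would estimate this from precomputed statistics instead.
    """
    survivors = sum(1 for row in table if abs(row[j] - dq) <= radius)
    return survivors / len(table)


def range_query(data, pivots, table, query, radius,
                max_pivots=2, scan_threshold=0.9, dist=euclidean):
    """Return (results, number of distance computations against data objects)."""
    q_to_p = [dist(query, p) for p in pivots]
    sel = [estimate_selectivity(table, j, q_to_p[j], radius)
           for j in range(len(pivots))]
    # Keep only the most selective pivots (lowest surviving fraction).
    chosen = sorted(range(len(pivots)), key=lambda j: sel[j])[:max_pivots]
    # If even the best pivot filters almost nothing, bypass filtering entirely.
    if not chosen or sel[chosen[0]] > scan_threshold:
        chosen = []
    results, computed = [], 0
    for x, row in zip(data, table):
        # Triangle inequality: |d(x,p) - d(q,p)| > r implies d(x,q) > r.
        if any(abs(row[j] - q_to_p[j]) > radius for j in chosen):
            continue
        computed += 1
        if dist(x, query) <= radius:
            results.append(x)
    return results, computed
```

The filter never discards a true answer, so the result equals that of a sequential scan; only the number of distance computations changes with the choice of pivots. In this sketch the pivot ranking costs a full pass over the table, which a real system would avoid by relying on index statistics.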