Proximity searching consists of retrieving from a database those elements that are similar to a query. Because the distance is usually expensive to compute, the goal is to answer queries with as few distance computations as possible. Indexes use precomputed distances among database elements to speed up queries. The baseline is AESA, which stores all the distances among database objects and has remained unbeaten in query performance for 20 years. In this paper we show that it is possible to improve upon AESA by using a radically different method to select promising database elements to compare against the query. Our experiments show improvements of up to 75% in document databases. We also explore the use of our method as a probabilistic algorithm that may lose relevant answers. On a database of faces where any exact algorithm must examine virtually all elements, our probabilistic version obtains 85% of the correct answers by scanning only 10% of the database.
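To make the baseline concrete, the following is a minimal sketch of an AESA-style nearest-neighbour search, written under stated assumptions rather than taken from the paper. It illustrates only the classical idea of a full precomputed distance matrix combined with triangle-inequality lower bounds, not the new selection method the abstract announces. The names `build_matrix` and `aesa_nn` are hypothetical, and the rule for picking the next candidate (smallest current lower bound) is one simple choice among several possible ones.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def build_matrix(db: Sequence[T], dist: Callable[[T, T], float]) -> List[List[float]]:
    """Precompute all pairwise distances (the O(n^2) index AESA relies on)."""
    n = len(db)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = dist(db[i], db[j])
    return matrix


def aesa_nn(
    db: Sequence[T],
    matrix: List[List[float]],
    dist: Callable[[T, T], float],
    query: T,
) -> Tuple[int, float, int]:
    """Return (index of nearest element, its distance, distance evaluations used).

    Assumes `dist` is a metric, so the triangle inequality holds.
    """
    n = len(db)
    lower = [0.0] * n            # lower bound on dist(query, db[i])
    alive = set(range(n))        # candidates not yet compared or eliminated
    best_idx, best_d = -1, float("inf")
    evaluations = 0

    while alive:
        # Approximating step (assumed rule): pick the most promising candidate,
        # here the one with the smallest current lower bound.
        p = min(alive, key=lambda i: lower[i])
        alive.remove(p)

        d = dist(query, db[p])   # the only expensive operation
        evaluations += 1
        if d < best_d:
            best_idx, best_d = p, d

        # Eliminating step: |d(q,p) - d(p,u)| <= d(q,u) by the triangle
        # inequality, so any u whose bound exceeds the best distance found
        # so far cannot be the nearest neighbour.
        for u in list(alive):
            lower[u] = max(lower[u], abs(d - matrix[p][u]))
            if lower[u] > best_d:
                alive.remove(u)

    return best_idx, best_d, evaluations
```

A small usage example on points of the real line with the absolute difference as the metric:

```python
points = [0.0, 1.0, 5.0, 9.0, 10.0]
metric = lambda a, b: abs(a - b)
index = build_matrix(points, metric)
nearest, distance, used = aesa_nn(points, index, metric, 4.2)
print(points[nearest], used)
```

The quadratic matrix is what makes this approach expensive in space, while the number of query-time distance evaluations (`used` above) is the cost the abstract is concerned with.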