The problem of efficiently finding the best match for a query in a given set with respect to Euclidean distance or cosine similarity has been studied extensively. However, to the best of our knowledge, the closely related problem of efficiently finding the best match with respect to the inner product has not been explored in the general setting. In this paper we consider this problem and contrast it with the previously studied ones. First, we propose a general branch-and-bound algorithm based on a (single) tree data structure. Next, we present a dual-tree algorithm for the case of multiple queries. Both branch-and-bound algorithms rely on novel inner-product bounds. Finally, we present a new data structure, the cone tree, that increases the efficiency of the dual-tree algorithm. We evaluate the proposed algorithms on a variety of data sets from various applications and observe up to five orders of magnitude improvement in query time over naive search in some cases.
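To make the single-tree branch-and-bound idea concrete, the following is a minimal sketch, not the algorithm as specified in the paper. It assumes a simple ball tree (each node stores a center c and a radius R covering its points) and uses the Cauchy-Schwarz bound <q, p> <= <q, c> + ||q|| R for any point p in the ball; the paper's own bounds and tree construction may differ. The names `Ball`, `upper_bound`, and `max_inner_product` are hypothetical and chosen only for illustration.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


class Ball:
    """Node of a simple ball tree (an assumed structure, for illustration only)."""

    def __init__(self, points, leaf_size=8):
        dim = len(points[0])
        # Center is the mean of the points; radius covers all of them.
        self.center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
        self.radius = max(math.dist(self.center, p) for p in points)
        self.points, self.children = points, []
        if len(points) > leaf_size and self.radius > 0:
            # Split at the median of the coordinate with the largest spread.
            axis = max(range(dim),
                       key=lambda i: max(p[i] for p in points) - min(p[i] for p in points))
            ordered = sorted(points, key=lambda p: p[axis])
            mid = len(ordered) // 2
            self.children = [Ball(ordered[:mid], leaf_size), Ball(ordered[mid:], leaf_size)]
            self.points = []

    def upper_bound(self, query, query_norm):
        # <q, p> = <q, c> + <q, p - c> <= <q, c> + ||q|| * R  (Cauchy-Schwarz)
        return dot(query, self.center) + query_norm * self.radius


def max_inner_product(root, query):
    """Return (best inner product, best point) using branch-and-bound pruning."""
    best = [-math.inf, None]
    query_norm = math.sqrt(dot(query, query))

    def search(node):
        # Prune: no point in this ball can beat the current best.
        if node.upper_bound(query, query_norm) <= best[0]:
            return
        if not node.children:
            for p in node.points:
                value = dot(query, p)
                if value > best[0]:
                    best[0], best[1] = value, p
            return
        # Visit the child with the larger bound first to tighten `best` early.
        for child in sorted(node.children,
                            key=lambda c: c.upper_bound(query, query_norm),
                            reverse=True):
            search(child)

    search(root)
    return best[0], best[1]
```

Calling `max_inner_product(Ball(points), query)` on a list of equal-length tuples returns the same maximum as a linear scan, up to floating-point rounding. The sketch handles one query at a time; the dual-tree variant and the cone tree described in the abstract are not shown here.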