Maximum inner-product search using cone trees

Authors:
Parikshit Ram; Alexander G. Gray
Affiliations:
Georgia Institute of Technology, Atlanta, GA, USA; Georgia Institute of Technology, Atlanta, GA, USA
Venue:
Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Year:
2012

Abstract

The problem of efficiently finding the best match for a query in a given set with respect to the Euclidean distance or the cosine similarity has been extensively studied. However, to the best of our knowledge, the closely related problem of efficiently finding the best match with respect to the inner product has never been explored in the general setting. In this paper, we consider this problem and contrast it with the previously studied problems. First, we propose a general branch-and-bound algorithm based on a (single) tree data structure. Subsequently, we present a dual-tree algorithm for the case where there are multiple queries. Our proposed branch-and-bound algorithms are based on novel inner-product bounds. Finally, we present a new data structure, the cone tree, for increasing the efficiency of the dual-tree algorithm. We evaluate our proposed algorithms on a variety of data sets from various applications and, in some cases, demonstrate up to five orders of magnitude improvement in query time over the naive search technique.
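
To make the single-tree branch-and-bound idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the tree or the bounds, so the sketch assumes a simple ball tree and the Cauchy-Schwarz bound <q, p> <= <q, c> + ||q|| R for any point p in a ball with center c and radius R. The names `Ball`, `upper_bound`, and `mips` are hypothetical, and the dual-tree algorithm and the cone tree are not covered.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class Ball:
    """Node of a simple ball tree (assumed structure, not taken from the paper)."""

    def __init__(self, points, leaf_size=8):
        dim = len(points[0])
        self.center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
        self.radius = max(math.dist(self.center, p) for p in points)
        self.points, self.children = points, []
        if len(points) > leaf_size and self.radius > 0:
            # Heuristic split: pick two far-apart pivots, assign each point
            # to the closer pivot.
            a = max(points, key=lambda p: math.dist(points[0], p))
            b = max(points, key=lambda p: math.dist(a, p))
            left = [p for p in points if math.dist(p, a) <= math.dist(p, b)]
            right = [p for p in points if math.dist(p, a) > math.dist(p, b)]
            if left and right:
                self.children = [Ball(left, leaf_size), Ball(right, leaf_size)]
                self.points = []


def upper_bound(query, node):
    # <q, p> = <q, c> + <q, p - c> <= <q, c> + ||q|| * R  (Cauchy-Schwarz)
    return dot(query, node.center) + math.sqrt(dot(query, query)) * node.radius


def mips(query, node, best=(-math.inf, None)):
    """Return (best inner product, best point) found in the subtree of node."""
    if upper_bound(query, node) <= best[0]:
        return best  # prune: nothing in this ball can beat the current best
    if not node.children:
        for p in node.points:
            score = dot(query, p)
            if score > best[0]:
                best = (score, p)
        return best
    # Visit the more promising child first so the other is more likely pruned.
    for child in sorted(node.children, key=lambda c: -upper_bound(query, c)):
        best = mips(query, child, best)
    return best


random.seed(0)
data = [[random.gauss(0, 1) for _ in range(5)] for _ in range(500)]
tree = Ball(data)
q = [random.gauss(0, 1) for _ in range(5)]
score, point = mips(q, tree)
```

The search returns the same maximum as a linear scan over `data` (up to floating-point rounding), but skips any subtree whose upper bound cannot exceed the best inner product found so far. Note that, unlike nearest-neighbor search, the bound depends on the norm of the query and on the position of the ball relative to the origin, not only on distances.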