An algorithm for finding nearest neighbours in (approximately) constant average time
Pattern Recognition Letters
Approximate nearest neighbors: towards removing the curse of dimensionality
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
An optimal algorithm for approximate nearest neighbor searching fixed dimensions
Journal of the ACM (JACM)
Data structures and algorithms for nearest neighbor search in general metric spaces
SODA '93 Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms
Multidimensional binary search trees used for associative searching
Communications of the ACM
Near Neighbor Search in Large Metric Spaces
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Locality-sensitive hashing scheme based on p-stable distributions
SCG '04 Proceedings of the twentieth annual symposium on Computational geometry
Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions
FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
Action Recognition Based on Non-parametric Probability Density Function Estimation
ISVC '09 Proceedings of the 5th International Symposium on Advances in Visual Computing: Part II
Hi-index | 0.02 |
Nearest Neighbor (NN) search is a basic algorithm for data mining and machine learning applications. However, its acceleration in high dimensional space is a difficult problem. For solving this problem, approximate NN search algorithms have been investigated. Especially, LSH is getting highlighted recently, because it has a clear relationship between relative error ratio and the computational complexity. However, the p-stable LSH computes hash values independent of the data distributions, and hence, sometimes the search fails or consumes considerably long time. For solving this problem, we propose Principal Component Hashing (PCH), which exploits the distribution of the stored data. Through experiments, we confirmed that PCH is faster than ANN and LSH at the same accuracy.