Approximate nearest neighbors: towards removing the curse of dimensionality
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
An optimal algorithm for approximate nearest neighbor searching fixed dimensions
Journal of the ACM (JACM)
Data structures and algorithms for nearest neighbor search in general metric spaces
SODA '93 Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms
An Algorithm for Finding Best Matches in Logarithmic Expected Time
ACM Transactions on Mathematical Software (TOMS)
Multiple view geometry in computer visiond
Multiple view geometry in computer visiond
Similarity estimation techniques from rounding algorithms
STOC '02 Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
Similarity Search in High Dimensions via Hashing
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Shape Indexing Using Approximate Nearest-Neighbour Search in High-Dimensional Spaces
CVPR '97 Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97)
Distinctive Image Features from Scale-Invariant Keypoints
International Journal of Computer Vision
Audio Fingerprinting: Nearest Neighbor Search in High Dimensional Binary Spaces
Journal of VLSI Signal Processing Systems
Scalable Recognition with a Vocabulary Tree
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
Learning task-specific similarity
Learning task-specific similarity
A Branch and Bound Algorithm for Computing k-Nearest Neighbors
IEEE Transactions on Computers
Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions
Communications of the ACM - 50th anniversary issue: 1958 - 2008
International Journal of Approximate Reasoning
BRIEF: binary robust independent elementary features
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part IV
Discriminative Learning of Local Image Descriptors
IEEE Transactions on Pattern Analysis and Machine Intelligence
LDAHash: Improved Matching with Smaller Descriptors
IEEE Transactions on Pattern Analysis and Machine Intelligence
SURF: speeded up robust features
ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part I
Iterative quantization: A procrustean approach to learning binary codes
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Hi-index | 0.10 |
Binary descriptors allow faster similarity computation than real-valued ones while requiring much less storage. As a result, many algorithms have recently been proposed to binarize floating-point descriptors so that they can be searched for quickly. Unfortunately, even if the similarity between vectors can be computed fast, exhaustive linear search remains impractical for truly large databases and approximate nearest neighbor (ANN) search is still required. It is therefore surprising that relatively little attention has been paid to the efficiency of ANN algorithms on binary vectors and this is the focus of this paper. We first show that binary-space Voronoi diagrams have thick boundaries, meaning that there are many points that lie at the same distance from two random points. This violates the implicit assumption made by most ANN algorithms that points can be neatly assigned to clusters centered around a set of cluster centers. As a result, state-of-the-art algorithms that can operate on binary vectors exhibit much lower performance than those that work with floating point ones. The above analysis is the first contribution of the paper. The second one is two effective ways to overcome this limitation, by appropriately randomizing either a tree-based algorithm or hashing-based one. In both cases, we show that we obtain precision/recall curves that are similar to those than can be obtained using floating point number calculation, but at much reduced computational cost.