NV-Tree: nearest neighbors at the billion scale

Authors:
Herwig Lejsek;Björn Þór Jónsson;Laurent Amsaleg
Affiliations:
Videntifier Technologies Reykjavík, Iceland;Reykjavik University, Iceland;IRISA--CNRS Rennes, France
Venue:
Proceedings of the 1st ACM International Conference on Multimedia Retrieval
Year:
2011

Abstract

This paper presents the NV-Tree (Nearest Vector Tree). It addresses the specific, yet important, problem of efficiently and effectively finding the approximate k-nearest neighbors within a collection of a few billion high-dimensional data points. The NV-Tree is a very compact index: it keeps only six bytes per high-dimensional descriptor. It thus scales extremely well when indexing large collections of high-dimensional descriptors. The NV-Tree efficiently produces results of good quality, even at a scale where the indices no longer fit entirely in main memory. We demonstrate this with extensive experiments using a collection of 2.5 billion SIFT (Scale Invariant Feature Transform) descriptors.
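
To put the six-bytes-per-descriptor figure in perspective, the following is a minimal back-of-the-envelope sketch in Python. The function names are hypothetical and chosen for illustration only. It assumes the standard 128-dimensional SIFT descriptor stored at one byte per dimension, which the abstract does not state, and it counts only the per-descriptor index entries, ignoring any overhead from the tree structure itself.

```python
# Back-of-the-envelope storage estimate (illustrative, not from the paper).

GB = 10**9  # decimal gigabyte


def index_size_bytes(num_descriptors: int, bytes_per_entry: int = 6) -> int:
    """Size of the per-descriptor index entries, at six bytes each (from the abstract)."""
    return num_descriptors * bytes_per_entry


def raw_size_bytes(num_descriptors: int, dims: int = 128, bytes_per_dim: int = 1) -> int:
    """Size of the raw descriptors.

    Assumption: 128-dimensional SIFT vectors stored at one byte per dimension.
    """
    return num_descriptors * dims * bytes_per_dim


if __name__ == "__main__":
    n = 2_500_000_000  # collection size used in the paper's experiments
    print(f"index entries:   {index_size_bytes(n) / GB:.0f} GB")
    print(f"raw descriptors: {raw_size_bytes(n) / GB:.0f} GB")
```

Under these assumptions the index entries for 2.5 billion descriptors occupy roughly 15 GB, against roughly 320 GB for the raw descriptors.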