Although Locality-Sensitive Hashing (LSH) is a promising approach to similarity search in high-dimensional spaces, it has not been considered practical, partly because its search quality is sensitive to several parameters that are quite data dependent. Previous research on LSH, although it has produced interesting asymptotic results, provides little guidance on how these parameters should be chosen, and tuning parameters for a given dataset remains a tedious process. To address this problem, we present a statistical performance model of Multi-probe LSH, a state-of-the-art variant of LSH. Our model can accurately predict the average search quality and latency given a small sample dataset. Apart from automatic parameter tuning with the performance model, we also use the model to devise an adaptive LSH search algorithm that determines the probing parameter dynamically for each query. The adaptive probing method addresses the problem that even when the average performance is tuned to be optimal, the variance of the performance is extremely high. We experimented with three different datasets, covering audio, images and 3D shapes, to evaluate our methods. The results show the accuracy of the proposed model: the errors in predicted recall are within 5% of the real values in most cases, and the adaptive search method reduces the standard deviation of recall by about 50% compared with the existing method.
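The idea of choosing the probing parameter per query can be illustrated with a minimal sketch. It is not the paper's model: it assumes a hypothetical input, `bucket_probs`, where each entry is some model's estimate of the fraction of a query's true neighbors that fall into one candidate bucket, so that the estimated recall is simply the sum over the probed buckets. The function name and parameters are illustrative only.

```python
from typing import Sequence


def adaptive_probe_count(bucket_probs: Sequence[float],
                         target_recall: float,
                         max_probes: int) -> int:
    """Return the smallest number of probes whose estimated recall meets the target.

    bucket_probs is a hypothetical per-query estimate (an assumption of this
    sketch): entry i is the estimated fraction of the query's true neighbors
    stored in candidate bucket i.
    """
    # Probe the most promising buckets first.
    ordered = sorted(bucket_probs, reverse=True)
    estimated_recall = 0.0
    for probes, p in enumerate(ordered[:max_probes], start=1):
        estimated_recall += p
        if estimated_recall >= target_recall:
            return probes
    # Target not reachable within the budget: use every probe allowed.
    return min(max_probes, len(ordered))


# An "easy" query concentrates its neighbors in a few buckets.
easy = adaptive_probe_count([0.5, 0.25, 0.125, 0.125], target_recall=0.75, max_probes=4)
# A "hard" query spreads its neighbors evenly and needs more probes.
hard = adaptive_probe_count([0.25, 0.25, 0.25, 0.25], target_recall=0.75, max_probes=4)
print(easy, hard)
```

Under these assumptions, a fixed probe count would over-probe the easy query or under-probe the hard one; letting the count follow the estimate is one way to picture why per-query probing can narrow the spread of recall.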