Scalable similarity search is at the core of many large-scale learning and data mining applications. Recent research has shown that one promising approach is to create compact and efficient hash codes that preserve data similarity. By efficient, we mean low correlation (and thus low redundancy) among the generated codes. However, most existing hashing methods are designed only for vector data. In this paper, we develop a new hashing algorithm that creates efficient codes for large-scale data of general formats with any kernel function, including kernels on vectors, graphs, sequences, sets, and so on. Starting from an idea analogous to spectral hashing, we propose novel formulations and solutions such that a kernel-based hash function can be explicitly represented and optimized, and directly applied to compute compact hash codes for new samples of general formats. Moreover, we incorporate efficient techniques, such as the Nyström approximation, to further reduce the time and space complexity of indexing and search, making our algorithm scalable to huge data sets. Another important advantage of our method is its ability to handle diverse types of similarity according to actual task requirements, including both feature similarities and semantic similarities such as label consistency. We evaluate our method on both vector and non-vector data sets at large scale, up to 1 million samples. Our comprehensive results show that the proposed method outperforms several state-of-the-art approaches on all tasks, with a significant gain on most tasks.
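As an illustration only, the following Python sketch shows the general shape of a kernel-based hash function of the kind described above. It assumes each bit has the form sign(sum_j a_j K(x, z_j) - b) over a small set of landmark samples z_j, so that only kernel evaluations against the landmarks are needed for a new sample. The function names (`fit_kernel_hash`, `hash_codes`, `kernel_matrix`) and the example kernels are hypothetical. The weights here come from a plain eigendecomposition of centered landmark kernel features, which decorrelates the projections on the landmarks; this is a stand-in and not the optimization proposed in the paper, whose details the abstract does not give.

```python
import numpy as np


def rbf_kernel(x, z, gamma=0.1):
    """Gaussian RBF kernel on vectors (one possible choice of kernel)."""
    d = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.dot(d, d)))


def jaccard_kernel(a, b):
    """Jaccard similarity between two sets: an example kernel on non-vector data."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def kernel_matrix(xs, zs, kernel):
    """Matrix of kernel values K[i, j] = kernel(xs[i], zs[j])."""
    return np.array([[kernel(x, z) for z in zs] for x in xs])


def fit_kernel_hash(landmarks, kernel, n_bits):
    """Fit weights A and thresholds b for bits of the form (k(x) @ A - b > 0).

    Assumption: a kernel-PCA-style fit on the landmarks only, used as a
    stand-in for the paper's optimization. Requires n_bits <= len(landmarks).
    """
    K = kernel_matrix(landmarks, landmarks, kernel)  # shape (m, m)
    col_mean = K.mean(axis=0)
    Kc = K - col_mean                                # center each kernel feature
    C = Kc.T @ Kc / len(landmarks)                   # covariance of kernel features
    eigvals, eigvecs = np.linalg.eigh(C)             # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_bits]         # indices of the largest ones
    A = eigvecs[:, top]                              # shape (m, n_bits)
    b = col_mean @ A                                 # zero-mean projections on landmarks
    return A, b


def hash_codes(samples, landmarks, kernel, A, b):
    """Binary codes for samples of any format the kernel accepts."""
    Kx = kernel_matrix(samples, landmarks, kernel)   # shape (n, m)
    return (Kx @ A - b > 0).astype(np.uint8)


def hamming(code_a, code_b):
    """Number of differing bits between two codes."""
    return int(np.count_nonzero(code_a != code_b))


# Vector data: 200 random points, 32 of them used as landmarks, 16-bit codes.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 8))
landmarks = data[:32]
A, b = fit_kernel_hash(landmarks, rbf_kernel, n_bits=16)
codes = hash_codes(data, landmarks, rbf_kernel, A, b)

# Non-vector data: the same code path works on sets with a set kernel.
sets = [{1, 2, 3}, {2, 3, 4}, {7, 8}, {8, 9}]
A_s, b_s = fit_kernel_hash(sets, jaccard_kernel, n_bits=2)
set_codes = hash_codes(sets, sets, jaccard_kernel, A_s, b_s)
```

Candidate neighbors of a query would then be ranked by `hamming` distance between codes. Hashing a new sample costs one kernel evaluation per landmark rather than one per database item, which is the kind of saving that landmark-based approximations such as Nyström aim for.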