Hash-based methods achieve fast similarity search by representing high-dimensional data with compact binary codes. However, generating binary codes and encoding unseen data both effectively and efficiently remain very challenging tasks. In this article, we address these tasks for approximate similarity search by proposing a novel hash-based method named sparse hashing (SH for short). To generate interpretable (or semantically meaningful) binary codes, SH first converts the original data into low-dimensional data through a novel nonnegative sparse coding method. SH then maps the low-dimensional data into Hamming space (i.e., binarizes the low-dimensional data) using a new binarization rule. The training data are then represented by the generated binary codes. To encode unseen data efficiently and effectively, SH learns hash functions that take a priori knowledge into account, such as the implicit group effect of the features in the training data and the correlations between the original space and the learned Hamming space. With short binary code representations, SH can perform fast approximate similarity search using efficient bitwise XOR operations in the memory of a modern PC. Experimental results show that the proposed SH significantly outperforms state-of-the-art techniques.
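The XOR-based search step mentioned above can be illustrated with a minimal sketch. It is not the SH method itself: it only shows how, once items are already represented as short binary codes, the Hamming distance reduces to an XOR followed by a bit count. The helper names (`pack_bits`, `hamming_distance`, `hamming_search`) and the 4-bit toy codes are assumptions made for illustration; how SH actually produces its codes is not specified here.

```python
from typing import List, Sequence


def pack_bits(bits: Sequence[int]) -> int:
    """Pack a sequence of 0/1 values into one integer (first bit is most significant)."""
    code = 0
    for b in bits:
        code = (code << 1) | (1 if b else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits: XOR marks the differences, then count the set bits."""
    return bin(a ^ b).count("1")


def hamming_search(query: int, database: Sequence[int], k: int) -> List[int]:
    """Return indices of the k codes closest to the query (ties keep database order)."""
    order = sorted(range(len(database)),
                   key=lambda i: hamming_distance(query, database[i]))
    return order[:k]


# Toy 4-bit codes (hypothetical data, for illustration only).
database = [pack_bits([1, 0, 1, 1]),
            pack_bits([0, 0, 0, 0]),
            pack_bits([1, 0, 1, 0])]
query = pack_bits([1, 0, 1, 1])
nearest = hamming_search(query, database, k=2)
```

A linear scan is used here for clarity. Because each comparison is a single XOR plus a bit count on a machine word, scanning codes held in main memory is cheap, which is the property the abstract relies on.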