Sparse hashing for fast multimedia search

Authors:
Xiaofeng Zhu;Zi Huang;Hong Cheng;Jiangtao Cui;Heng Tao Shen
Affiliations:
The University of Queensland, Australia;The University of Queensland, Australia;The Chinese University of Hong Kong;Xidian University, China;The University of Queensland, Australia
Venue:
ACM Transactions on Information Systems (TOIS)
Year:
2013

Abstract

Hash-based methods achieve fast similarity search by representing high-dimensional data with compact binary codes. However, generating binary codes and encoding unseen data both effectively and efficiently remain very challenging tasks. In this article, we address these tasks for approximate similarity search by proposing a novel hash-based method named sparse hashing (SH for short). To generate interpretable (or semantically meaningful) binary codes, SH first converts the original data into low-dimensional data through a novel nonnegative sparse coding method. SH then maps the low-dimensional data into Hamming space (i.e., binary-encodes the low-dimensional data) using a new binarization rule, so that the training data are represented by the generated binary codes. To encode unseen data efficiently and effectively, SH learns hash functions that take a priori knowledge into account, such as the implicit group effect of the features in the training data and the correlations between the original space and the learned Hamming space. Because the binary codes are short, SH can perform fast approximate similarity search with efficient bitwise XOR operations in the memory of a modern PC. Experimental results show that the proposed SH significantly outperforms state-of-the-art techniques.
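As a rough illustration of the search step described in the abstract, the following Python sketch packs low-dimensional nonnegative codes into integers and ranks database items by Hamming distance computed with XOR. The function names (`binarize`, `hamming`, `search`), the toy data, and in particular the thresholding rule are assumptions made for this example; the abstract does not specify the article's binarization rule or its learned hash functions, so this should not be read as the authors' method.

```python
from typing import List, Sequence


def binarize(codes: Sequence[Sequence[float]], threshold: float = 0.0) -> List[int]:
    """Pack each real-valued code vector into an int, one bit per dimension.

    Assumption: a bit is 1 when the coefficient exceeds `threshold`.
    This is a placeholder, not the binarization rule proposed in the article.
    """
    packed = []
    for vec in codes:
        bits = 0
        for value in vec:
            bits = (bits << 1) | (1 if value > threshold else 0)
        packed.append(bits)
    return packed


def hamming(a: int, b: int) -> int:
    """Hamming distance: XOR marks the differing bits, then count the set bits."""
    return bin(a ^ b).count("1")


def search(query: int, database: Sequence[int], k: int = 3) -> List[int]:
    """Return the indices of the k database codes closest to the query."""
    order = sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
    return order[:k]


# Toy nonnegative sparse codes (hypothetical values, 4 dimensions each).
sparse_codes = [
    [0.9, 0.0, 0.0, 0.4],
    [0.8, 0.0, 0.1, 0.0],
    [0.0, 0.7, 0.0, 0.0],
]
database = binarize(sparse_codes)
query = binarize([[0.5, 0.0, 0.0, 0.3]])[0]
nearest = search(query, database, k=2)
```

Because each code fits in a machine word or a small integer, the whole database can stay in memory and a query reduces to one XOR and one bit count per item. A linear scan is used here only for brevity.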