Fast similarity search is a key step in many large-scale computer vision and information retrieval tasks. Recently, there has been a surge of research interest in hashing-based techniques that allow approximate but highly efficient similarity search. Most existing hashing methods are unsupervised: they generate binary codes from unlabeled data alone and have shown promising performance. In this paper, we propose a novel semi-supervised hashing method that takes pairwise supervised information into account, in the form of must-link and cannot-link constraints, and then maximizes the information provided by each bit according to both the labeled and the unlabeled data. Unlike previous work on semi-supervised hashing, we measure the Hamming distance by the squared Euclidean distance, which leads to a more general Laplacian-matrix-based solution once the binary constraints are relaxed. We also relax the orthogonality constraints to reduce the error incurred when converting the real-valued solution to a binary one. Experimental evaluations on three benchmark datasets show that the proposed method outperforms state-of-the-art approaches.
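The link between the Hamming distance, the squared Euclidean distance, and a Laplacian matrix can be illustrated with a minimal sketch. It is not the method proposed in the paper. It assumes that codes take values in {-1, +1} and that pairwise supervision is encoded in a symmetric weight matrix W, with +1 for a must-link pair and -1 for a cannot-link pair. That encoding and all function names are hypothetical choices made for illustration. Under these assumptions, the squared Euclidean distance between two codes equals four times their Hamming distance, and the weighted sum of pairwise squared distances equals 2 tr(Y^T L Y) with L = D - W.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pairwise_objective(codes, weights):
    """Sum of W[i][j] * ||y_i - y_j||^2 over all ordered pairs (i, j)."""
    n = len(codes)
    return sum(weights[i][j] * sq_euclidean(codes[i], codes[j])
               for i in range(n) for j in range(n))

def laplacian_trace(codes, weights):
    """tr(Y^T L Y) with L = D - W, where D holds the row sums of W."""
    n, k = len(codes), len(codes[0])
    total = 0
    for b in range(k):  # one term per bit (column of Y)
        col = [codes[i][b] for i in range(n)]
        for i in range(n):
            for j in range(n):
                lap = (sum(weights[i]) if i == j else 0) - weights[i][j]
                total += col[i] * lap * col[j]
    return total

# Toy data (assumed, not from the paper): three 4-bit codes in {-1, +1}.
codes = [[1, -1, 1, 1], [1, 1, -1, 1], [-1, -1, 1, -1]]
# Assumed encoding: items 0 and 1 are must-link (+1), items 1 and 2 cannot-link (-1).
weights = [[0, 1, 0], [1, 0, -1], [0, -1, 0]]

# For {-1, +1} codes, each differing bit contributes (1 - (-1))^2 = 4.
for a, b in combinations(codes, 2):
    assert sq_euclidean(a, b) == 4 * hamming(a, b)

# The pairwise objective is a quadratic form in the graph Laplacian.
assert pairwise_objective(codes, weights) == 2 * laplacian_trace(codes, weights)
```

Because the objective is a quadratic form in the codes, dropping the binary constraint turns the problem into a continuous one over real-valued Y, which is the kind of relaxation the abstract refers to. The thresholding step that maps the real-valued solution back to bits is not shown here.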