Normalized Cuts and Image Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence
Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
Fast Pose Estimation with Parameter-Sensitive Hashing
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Locality preserving indexing for document representation
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Content-based multimedia information retrieval: State of the art and challenges
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
Finding near-duplicate web pages: a large-scale evaluation of algorithms
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions
FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
Learning to hash: forgiving hash functions and applications
Data Mining and Knowledge Discovery
Nearest-neighbor caching for content-match applications
Proceedings of the 18th international conference on World wide web
Self-taught hashing for fast similarity search
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Laplacian co-hashing of terms and documents
ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval
Hi-index | 0.00 |
The paper is concerned with similarity search at large scale, which efficiently and effectively finds similar data points for a query data point. An efficient way to accelerate similarity search is to learn hash functions. The existing approaches for learning hash functions aim to obtain low values of Hamming distances for the similar pairs. However, these methods ignore the ranking order of these Hamming distances. This leads to the poor accuracy about finding similar items for a query data point. In this paper, an algorithm is proposed, referred to top k RHS (Rank Hash Similarity), in which a ranking loss function is designed for learning a hash function. The hash function is hypothesized to be made up of l binary classifiers. The issue of learning a hash function can be formulated as a task of learning l binary classifiers. The algorithm runs l rounds and learns a binary classifier at each round. Compared with the existing approaches, the proposed method has the same order of computational complexity. Nevertheless, experiment results on three text datasets show that the proposed method obtains higher accuracy than the baselines.