Rank hash similarity for fast similarity search

Authors:
Min Lu;Yalou Huang;Maoqiang Xie;Jie Liu
Affiliations:
College of Information Technical Science, Nankai University, Tianjin, China;College of Information Technical Science, Nankai University, Tianjin, China and College of Software, Nankai University, Tianjin, China;College of Software, Nankai University, Tianjin, China and Information Technology Research Base of Civil Aviation Administration of China, Civil Aviation University of China, China;College of Information Technical Science, Nankai University, Tianjin, China
Venue:
Information Processing and Management: an International Journal
Year:
2013

Abstract

This paper addresses large-scale similarity search: efficiently and accurately finding the data points most similar to a query data point. An effective way to accelerate similarity search is to learn hash functions. Existing approaches to learning hash functions aim to assign small Hamming distances to similar pairs, but they ignore the ranking order of those Hamming distances, which lowers the accuracy of retrieving similar items for a query. This paper proposes an algorithm, referred to as top-k RHS (Rank Hash Similarity), in which a ranking loss function is designed for learning the hash function. The hash function is assumed to consist of l binary classifiers, so learning it can be formulated as learning l binary classifiers. The algorithm runs for l rounds and learns one binary classifier per round. The proposed method has the same order of computational complexity as existing approaches, yet experimental results on three text datasets show that it achieves higher accuracy than the baselines.
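To make the setting concrete, the following is a minimal sketch of how a hash function built from l binary classifiers can produce an l-bit code, and how candidates can then be ranked by Hamming distance to a query. It does not implement the RHS training procedure or its ranking loss, which the abstract does not specify. The linear-threshold form of each classifier, the example weights, and the names `hash_code`, `hamming`, and `top_k` are all assumptions introduced for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# Assumption: each binary classifier is a linear threshold (weights, bias).
Classifier = Tuple[Vector, float]


def hash_code(x: Vector, classifiers: List[Classifier]) -> List[int]:
    """Apply each of the l binary classifiers to x; each yields one bit."""
    bits = []
    for weights, bias in classifiers:
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        bits.append(1 if score > 0 else 0)
    return bits


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions in which two equal-length codes differ."""
    return sum(ai != bi for ai, bi in zip(a, b))


def top_k(query: Vector, database: List[Vector],
          classifiers: List[Classifier], k: int) -> List[int]:
    """Indices of the k database items closest to the query in Hamming
    distance; ties are broken by the lower index."""
    q = hash_code(query, classifiers)
    dists = [(hamming(q, hash_code(x, classifiers)), i)
             for i, x in enumerate(database)]
    dists.sort()
    return [i for _, i in dists[:k]]


# Hypothetical hand-picked classifiers (l = 3), not learned ones.
classifiers = [([1.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, -1.0], 0.0)]
database = [[2.0, 1.0], [-1.0, 3.0], [-2.0, -1.0], [3.0, 0.5]]
query = [1.5, 1.0]

print(hash_code(query, classifiers))          # [1, 1, 1]
print(top_k(query, database, classifiers, 2)) # [0, 3]
```

In this toy example the ranking of candidates depends only on the ordering of their Hamming distances to the query, which is the quantity the abstract says a ranking loss should account for during learning.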