Active hashing and its application to image and text retrieval

  • Authors:
  • Yi Zhen; Dit-Yan Yeung

  • Affiliations:
  • Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Kowloon, China; Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Kowloon, China

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 2013

Abstract

In recent years, hashing-based methods for large-scale similarity search have sparked considerable research interest in the data mining and machine learning communities. While unsupervised hashing-based methods have achieved promising results for metric similarity, they cannot handle semantic similarity, which is usually given in the form of labeled point pairs. To overcome this limitation, some attempts have recently been made at semi-supervised hashing, which aims to learn hash functions from metric and semantic similarity simultaneously. Existing semi-supervised hashing methods can be regarded as passive hashing, since they assume that the labeled pairs are provided in advance. In this paper, we propose a novel framework, called active hashing, which actively selects the most informative labeled pairs for hash function learning. Specifically, it identifies the most informative points to label and constructs labeled pairs accordingly. Under this framework, we use data uncertainty as a measure of informativeness and develop a batch-mode algorithm to speed up active selection. We empirically compare our method with a state-of-the-art passive hashing method on two benchmark data sets, showing that the proposed method can reduce labeling cost as well as overcome the limitations of passive hashing.
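
The selection loop outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not define the uncertainty measure or the batch-mode procedure, so the sketch assumes linear hash functions (one hyperplane per bit), scores a point as more uncertain the closer it lies to any hyperplane, and simply takes the top-scoring unlabeled points as a batch. The function names (hash_codes, uncertainty, select_batch, build_pairs) and the toy data are hypothetical.

```python
import numpy as np


def hash_codes(X, W):
    """Binary codes from linear projections: bit k is 1 when x @ W[:, k] >= 0."""
    return (X @ W >= 0).astype(np.uint8)


def uncertainty(X, W):
    """Assumed score: higher when a point is close to at least one hash hyperplane."""
    margins = np.abs(X @ W) / np.linalg.norm(W, axis=0)
    return -margins.min(axis=1)


def select_batch(X, W, labeled, batch_size):
    """Return the indices of the batch_size most uncertain unlabeled points."""
    scores = uncertainty(X, W)
    for i in labeled:
        scores[i] = -np.inf  # never re-select points that already have labels
    order = np.argsort(-scores, kind="stable")
    return order[:batch_size].tolist()


def build_pairs(indices, labels):
    """Turn point labels into pairs: +1 for same label, -1 for different labels."""
    pairs = []
    for a in range(len(indices)):
        for b in range(a + 1, len(indices)):
            i, j = indices[a], indices[b]
            pairs.append((i, j, 1 if labels[i] == labels[j] else -1))
    return pairs


# Toy usage: four 2-D points and two axis-aligned hash hyperplanes (assumed data).
X = np.array([[0.1, 2.0], [3.0, 0.05], [2.0, 2.0], [-1.5, 1.5]])
W = np.eye(2)
batch = select_batch(X, W, labeled=set(), batch_size=2)
oracle_labels = {i: ("cat" if X[i, 0] > 0 else "dog") for i in batch}  # stand-in for a human annotator
pairs = build_pairs(batch, oracle_labels)
```

In a full system, the resulting pairs would be passed to a semi-supervised hashing learner, the projection matrix W would be re-estimated, and the selection step would repeat until the labeling budget is spent.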