Error-correcting output hashing in fast similarity search

  • Authors:
  • Zhou Yu; Deng Cai; Xiaofei He

  • Affiliations:
  • Zhejiang University, China; Zhejiang University, China; Zhejiang University, China

  • Venue:
  • ICIMCS '10 Proceedings of the Second International Conference on Internet Multimedia Computing and Service
  • Year:
  • 2010


Abstract

Fast similarity search is a key technique in many large-scale learning and data mining applications. Recently, hashing-based methods, which create compact and efficient codes that preserve the data distribution, have received considerable attention due to their promising theoretical and empirical results. An ideal hashing method 1) extends naturally to out-of-sample data; 2) has very low computational complexity; and 3) significantly improves on linear search in the original space in terms of accuracy. However, most existing hashing methods fail to satisfy all three requirements. In this paper, we propose a new method called Error-Correcting Output Hashing (ECOH), which meets all three. ECOH first groups the samples into clusters using a conventional clustering algorithm. Each cluster is assigned an Error-Correcting Output Code (ECOC), and linear mappings from the sample vectors to the ECOC are then learned using linear regression models. In this way, ECOH learns both the binary code for each sample and the function that links the input vector to the output code. Experimental results on real-world data sets demonstrate the effectiveness and efficiency of the proposed approach.
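
The pipeline described in the abstract (cluster, assign one code per cluster, regress from inputs to codes) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm or how the ECOC codebook is constructed, so plain k-means and random ±1 codewords are used here as stand-ins, and the function names `kmeans`, `fit_ecoh`, `hash_codes`, and `hamming_neighbors` are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (assumed stand-in for the 'conventional
    clustering algorithm'); returns one cluster label per sample."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def fit_ecoh(X, n_clusters, n_bits, seed=0):
    """Learn a linear map from inputs to per-cluster codewords."""
    rng = np.random.default_rng(seed)
    labels = kmeans(X, n_clusters, seed=seed)
    # ASSUMPTION: random +/-1 codewords stand in for a real ECOC design.
    codebook = rng.choice([-1.0, 1.0], size=(n_clusters, n_bits))
    targets = codebook[labels]                 # one codeword per sample
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    # Ordinary least squares: one linear regression per code bit.
    W, *_ = np.linalg.lstsq(Xb, targets, rcond=None)
    return W


def hash_codes(X, W):
    """Out-of-sample extension: project, then threshold at zero."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ W > 0).astype(np.uint8)


def hamming_neighbors(query_code, db_codes, top_k=5):
    """Indices of the top_k database codes closest in Hamming distance."""
    dist = (db_codes != query_code).sum(axis=1)
    return np.argsort(dist, kind="stable")[:top_k]
```

Under these assumptions, `W = fit_ecoh(X, n_clusters=4, n_bits=16)` followed by `hash_codes(queries, W)` encodes unseen vectors with a single matrix product and a threshold, which is what makes the out-of-sample extension cheap. Retrieval then reduces to comparing short binary codes with `hamming_neighbors`.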