Optimal hash functions for approximate matches on the n-cube

  • Authors:
  • Daniel M. Gordon;Victor S. Miller;Peter Ostapenko

  • Affiliations:
  • IDA Center for Communications Research, San Diego, CA;IDA Center for Communications Research, Princeton, NJ;IDA Center for Communications Research, San Diego, CA

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2010

Abstract

One way to find near-matches in large datasets is to use hash functions. In recent years, locality-sensitive hash functions have been given for various metrics; for the Hamming metric, projecting onto k bits is a simple hash function that performs well. In this paper, we investigate alternatives to projection. For various parameters, hash functions given by complete decoding algorithms for error-correcting codes work better, and asymptotically, random codes perform better than projection.
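
To make the two kinds of hash function in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's own construction or experiments: the [7,4] Hamming code is chosen only because its complete decoder is a few lines long, and the names `projection_hash`, `hamming74_hash`, `bucket_count`, and `collision_rate` are hypothetical. Both functions map the 7-cube to 16 buckets, so their collision rates on noisy copies of a point can be estimated side by side.

```python
import random
from functools import reduce
from itertools import product
from operator import xor


def projection_hash(x, coords):
    """Hash a bit vector by keeping only the chosen coordinates."""
    return tuple(x[i] for i in coords)


def hamming74_hash(x):
    """Hash a 7-bit vector to its nearest [7,4] Hamming codeword.

    Assumption for this sketch: the parity-check matrix has the binary
    expansions of 1..7 as columns, so the syndrome is the XOR of the
    1-based positions of the set bits. A nonzero syndrome names the
    single bit to flip; the code is perfect, so decoding is complete.
    """
    syndrome = reduce(xor, (i + 1 for i, b in enumerate(x) if b), 0)
    y = list(x)
    if syndrome:
        y[syndrome - 1] ^= 1
    return tuple(y)


def bucket_count(h, n):
    """Number of distinct hash values h takes over the whole n-cube."""
    return len({h(w) for w in product((0, 1), repeat=n)})


def collision_rate(h, n, p, trials, rng):
    """Estimate P[h(x) == h(y)], where x is uniform on the n-cube and
    y is x with each bit flipped independently with probability p."""
    hits = 0
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = [b ^ int(rng.random() < p) for b in x]
        hits += h(x) == h(y)
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    proj = lambda x: projection_hash(x, (0, 1, 2, 3))
    for p in (0.05, 0.2, 0.4):
        print(p,
              collision_rate(proj, 7, p, 20000, rng),
              collision_rate(hamming74_hash, 7, p, 20000, rng))
```

The sketch makes no claim about which function wins at a given flip probability p; the abstract's statements concern the parameters and codes studied in the paper, and this small code is only a convenient stand-in for experimenting with the idea.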