Fast Approximate Point Set Matching for Information Retrieval

  • Authors:
  • Raphaël Clifford;Benjamin Sach

  • Affiliations:
  • University of Bristol, Department of Computer Science, Woodland Road, Bristol, BS8 1UB, UK;University of Bristol, Department of Computer Science, Woodland Road, Bristol, BS8 1UB, UK

  • Venue:
  • SOFSEM '07 Proceedings of the 33rd conference on Current Trends in Theory and Practice of Computer Science
  • Year:
  • 2007

Abstract

We investigate randomised algorithms for subset matching with spatial point sets: given two sets of d-dimensional points, a data set T consisting of n points and a pattern P consisting of m points, find the largest match for a subset of the pattern in the data set. This problem is known to be 3-SUM hard and so unlikely to be solvable exactly in subquadratic time. We present an efficient bit-parallel O(nm) time algorithm and an O(n log m) time solution based on correlation calculations using fast Fourier transforms. Both methods are shown experimentally to give answers within a few percent of the exact solution and provide a considerable practical speedup over existing deterministic algorithms.
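The correlation idea can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that matches are taken under translation (the abstract does not say so explicitly), that coordinates are integers, and that points are mapped to a one-dimensional array by a random linear hash. The function names, the array length `size`, and the hash itself are all hypothetical choices made for illustration. The first function is an exact voting baseline over all pattern/data pairs. The second counts hashed pairs with a cyclic cross-correlation computed by FFT; because hash collisions can only add spurious pairs, its result is an upper estimate of the exact answer.

```python
from collections import Counter

import numpy as np


def largest_match_exact(pattern, data):
    """Exact baseline: every pair (p, t) votes for the translation t - p."""
    votes = Counter(
        tuple(ti - pi for ti, pi in zip(t, p))
        for p in pattern
        for t in data
    )
    shift, count = votes.most_common(1)[0]
    return count, shift


def largest_match_fft(pattern, data, size=1024, seed=0):
    """Approximate count via a random linear hash and cyclic correlation.

    Assumption (not from the source): h(x) = (a . x) mod size, so that
    h(p + s) = h(p) + h(s) (mod size) for any translation s.
    """
    rng = np.random.default_rng(seed)
    dim = len(pattern[0])
    a = [int(v) for v in rng.integers(1, size, size=dim)]

    def h(x):
        return sum(ai * xi for ai, xi in zip(a, x)) % size

    # Histogram of hashed points for each set.
    t_counts = np.bincount([h(t) for t in data], minlength=size)
    p_counts = np.bincount([h(p) for p in pattern], minlength=size)

    # corr[k] = number of pairs (p, t) with h(t) - h(p) == k (mod size).
    spectrum = np.fft.rfft(t_counts) * np.conj(np.fft.rfft(p_counts))
    corr = np.fft.irfft(spectrum, n=size)
    return int(np.rint(corr).max())


pattern = [(0, 0), (1, 0), (0, 1), (5, 5)]
data = [(10, 10), (11, 10), (10, 11), (3, 7), (20, 1)]
exact = largest_match_exact(pattern, data)
approx = largest_match_fft(pattern, data)
```

In this toy instance three of the four pattern points align with the data under the translation (10, 10). The FFT step costs time proportional to `size * log(size)`, so the array length trades speed against the collision rate; how the paper chooses that trade-off is not stated in the abstract.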