Efficient exact edit similarity query processing with the asymmetric signature scheme

  • Authors:
  • Jianbin Qin;Wei Wang;Yifei Lu;Chuan Xiao;Xuemin Lin

  • Affiliations:
  • University of New South Wales, Sydney, Australia;University of New South Wales, Sydney, Australia;University of New South Wales, Sydney, Australia;University of New South Wales, Sydney, Australia;University of New South Wales & East Normal China University, Sydney, Australia

  • Venue:
  • Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Given a query string Q, an edit similarity search finds all strings in a database whose edit distance with Q is no more than a given threshold t. Most existing method answering edit similarity queries rely on a signature scheme to generate candidates given the query string. We observe that the number of signatures generated by existing methods is far greater than the lower bound, and this results in high query time and index space complexities. In this paper, we show that the minimum signature size lower bound is t +1. We then propose asymmetric signature schemes that achieve this lower bound. We develop efficient query processing algorithms based on the new scheme. Several dynamic programming-based candidate pruning methods are also developed to further speed up the performance. We have conducted a comprehensive experimental study involving nine state-of-the-art algorithms. The experiment results clearly demonstrate the efficiency of our methods.