Approximate string matching with ordered q-grams

  • Authors: E. Sutinen; J. Tarhio

  • Venue: Nordic Journal of Computing
  • Year: 2004

Abstract

Approximate string matching with k differences is considered. Filtration of the text is a widely adopted technique for reducing the text area that must be processed by dynamic programming. We present sublinear filtration algorithms based on the locations of q-grams in the pattern. Samples of q-grams are drawn from the text at fixed periods, and a text area is examined with dynamic programming only if consecutive samples appear in the pattern in approximately the same configuration. The algorithm LEQ searches for exact occurrences of the pattern q-grams, whereas the algorithm LAQ searches for approximate occurrences of them. In addition, a static variation of LEQ is presented. The focus of the paper is on the combinatorial properties of the sampling approach.
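
The sampling idea in the abstract can be illustrated with a minimal Python sketch. It is a simplified filter, not the LEQ or LAQ algorithm of the paper: it only counts how many consecutive text samples occur anywhere in the pattern and omits the check that they appear in approximately the same configuration. The function names, the parameter s (the required number of matching samples), and the sampling period h = floor((m - k - q + 1) / (k + s)) are assumptions of this sketch. With that period, every occurrence with at most k differences contains at least k + s non-overlapping samples, and each difference can damage at most one of them, so at least s samples survive intact.

```python
def pattern_qgrams(pattern, q):
    """Set of all q-grams (length-q substrings) of the pattern."""
    return {pattern[i:i + q] for i in range(len(pattern) - q + 1)}


def sellers_ends(pattern, text, k):
    """Dynamic programming verification (hypothetical helper name).

    Returns every index j such that some substring of text ending at j
    (inclusive) is within edit distance k of the pattern.
    """
    m = len(pattern)
    col = list(range(m + 1))  # distances against the empty text prefix
    ends = []
    for j, c in enumerate(text):
        new = [0]  # an occurrence may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            new.append(min(col[i - 1] + cost, col[i] + 1, new[i - 1] + 1))
        col = new
        if col[m] <= k:
            ends.append(j)
    return ends


def filtered_search(pattern, text, k, q=2, s=2):
    """Sample q-grams from the text at period h and verify only promising areas.

    Assumption of this sketch: the ordering condition on the samples is ignored.
    """
    m, n = len(pattern), len(text)
    h = (m - k - q + 1) // (k + s)
    if h < q:
        raise ValueError("pattern too short for these k, q, s: samples would overlap")
    grams = pattern_qgrams(pattern, q)
    # hits[j] is 1 when the sample starting at text position j * h occurs in the pattern
    hits = [1 if text[p:p + q] in grams else 0 for p in range(0, n - q + 1, h)]
    w = k + s
    found = set()
    for j in range(len(hits) - w + 1):
        if sum(hits[j:j + w]) >= s:
            p0 = j * h
            lo = max(0, p0 - h + 1)   # an occurrence starts at most h - 1 before its first sample
            hi = min(n, p0 + m + k)   # and spans at most m + k characters
            for e in sellers_ends(pattern, text[lo:hi], k):
                found.add(lo + e)
    return sorted(found)
```

For example, `filtered_search("abcdefghijklmnopqrst", "zzzzz" + "abcdefgXijklmnopqrst" + "zzzzz", 2)` returns the same end positions as running `sellers_ends` over the whole text, while dynamic programming is applied only around sample windows that pass the count test. The ordering condition described in the abstract would reject further windows before verification.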