Restricted transposition invariant approximate string matching under edit distance

  • Authors: Heikki Hyyrö

  • Affiliations: Department of Computer Sciences, University of Tampere, Finland

  • Venue: SPIRE'05: Proceedings of the 12th International Conference on String Processing and Information Retrieval
  • Year: 2005

Abstract

Let A and B be strings with lengths m and n, respectively, over a finite integer alphabet. Two classic string matching problems are computing the edit distance between A and B, and searching for approximate occurrences of A inside B. We consider the classic Levenshtein distance, but the discussion also applies to indel distance. A relatively new variant [8] of string matching, motivated initially by the nature of string matching in music, is to allow transposition invariance for A. This means allowing A to be “shifted” by adding some fixed integer t to the values of all its characters: the underlying string matching task must then consider all possible values of t. Mäkinen et al. [12,13] have recently proposed O(mn log log m) and O(dn log log m) algorithms for transposition invariant edit distance computation, where d is the transposition invariant distance between A and B, and an O(mn log log m) algorithm for transposition invariant approximate string matching. In this paper we first propose a scheme to construct transposition invariant algorithms that depend on d or k. Then we proceed to give an O(n + d³) algorithm for transposition invariant edit distance, and an O(k²n) algorithm for transposition invariant approximate string matching.
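
To make the problem definition concrete, the following is a minimal brute-force sketch of transposition invariant Levenshtein distance. It is not the algorithm proposed in the paper or in [12,13]; it only illustrates the definition. The function names are hypothetical. The sketch assumes strings are given as sequences of integers, and it relies on the observation that only shifts of the form t = b − a (for some character a of A and b of B) can produce a match, so no other value of t can give a smaller distance.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two integer sequences."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # match or substitute
        prev = curr
    return prev[-1]


def transposition_invariant_distance(a, b):
    """Minimum edit distance between (a shifted by t) and b over all t.

    Brute force: tries every shift that aligns some character of a with
    some character of b, and falls back to t = 0 if either input is empty.
    """
    candidates = {y - x for x in a for y in b} or {0}
    return min(edit_distance([x + t for x in a], b) for t in candidates)


if __name__ == "__main__":
    # [1, 2, 3] shifted by 3 equals [4, 5, 6].
    print(transposition_invariant_distance([1, 2, 3], [4, 5, 6]))
    # The best shift is 10, leaving one substitution.
    print(transposition_invariant_distance([1, 2, 3], [11, 12, 14]))
```

This baseline may run up to mn separate O(mn) dynamic programs, which is the kind of cost that the bounds quoted in the abstract improve on.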