String matching with inversions and translocations in linear average time (most of the time)

  • Authors:
  • Szymon Grabowski;Simone Faro;Emanuele Giaquinta

  • Affiliations:
  • Technical University of Łódź, Computer Engineering Department, Al. Politechniki 11, 90-924 Łódź, Poland;Università di Catania, Dipartimento di Matematica e Informatica, Viale Andrea Doria 6, I-95125 Catania, Italy;Università di Catania, Dipartimento di Matematica e Informatica, Viale Andrea Doria 6, I-95125 Catania, Italy

  • Venue:
  • Information Processing Letters
  • Year:
  • 2011

Abstract

We present an efficient algorithm for finding all approximate occurrences of a given pattern p of length m in a text t of length n allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an $O(nm \max(\alpha, \beta))$-time complexity in the worst case and $O(\max(\alpha, \beta, \sigma))$-space complexity, where $\alpha$ and $\beta$ are respectively the maximum length of the factors involved in any translocation and inversion, and $\sigma$ is the alphabet size. Moreover we show that our algorithm has an $O(n)$ average time complexity, whenever $\sigma = \Omega(\log m / \log\log^{1-\varepsilon} m)$, for $\varepsilon > 0$. Experiments show that the proposed algorithm achieves very good results in practical cases.
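
To make the matching model concrete, the following is a minimal brute-force sketch in Python. It is not the filtering algorithm of the paper, only a naive reference checker. It assumes that the edit operations act on disjoint factors of the pattern, that an inversion reverses a factor of length at most beta, and that a translocation swaps two adjacent factors of equal length at most alpha. The function names `mutation_match` and `find_occurrences` are hypothetical and chosen for illustration.

```python
def mutation_match(p, w, alpha, beta):
    """Naive check: can p be turned into w using non-overlapping
    translocations (factor length <= alpha) and inversions
    (factor length <= beta)? Assumed model, not the paper's algorithm."""
    m = len(p)
    if len(w) != m:
        return False
    # reach[i] is True when p[:i] can be transformed into w[:i].
    reach = [False] * (m + 1)
    reach[0] = True
    for i in range(1, m + 1):
        # Case 1: the last character is left unchanged.
        if reach[i - 1] and p[i - 1] == w[i - 1]:
            reach[i] = True
            continue
        # Case 2: an inverted factor of length k ends at position i.
        if any(reach[i - k] and p[i - k:i][::-1] == w[i - k:i]
               for k in range(2, min(beta, i) + 1)):
            reach[i] = True
            continue
        # Case 3: two adjacent factors of length k were swapped, ending at i.
        reach[i] = any(
            reach[i - 2 * k]
            and p[i - 2 * k:i - k] == w[i - k:i]
            and p[i - k:i] == w[i - 2 * k:i - k]
            for k in range(1, min(alpha, i // 2) + 1)
        )
    return reach[m]


def find_occurrences(p, t, alpha, beta):
    """Return all start positions s where t[s:s+len(p)] matches p
    under the assumed translocation/inversion model."""
    m = len(p)
    return [s for s in range(len(t) - m + 1)
            if mutation_match(p, t[s:s + m], alpha, beta)]
```

For example, under these assumptions `find_occurrences("abcd", "cdab", 2, 2)` reports position 0, because swapping the adjacent factors "ab" and "cd" is a translocation of length 2, while the same call with alpha set to 1 reports no occurrence. Checking every window this way costs time polynomial in m per text position, which is the kind of work a filtering method tries to avoid for most windows.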