Text searching allowing for inversions and translocations of factors

Authors:
Domenico Cantone;Simone Faro;Emanuele Giaquinta
Venue:
Discrete Applied Mathematics
Year:
2014


Abstract

The approximate string matching problem consists in finding all locations at which a pattern $p$ of length $m$ matches a substring of a text $t$ of length $n$, after a finite number of given edit operations. In this paper, we investigate such a problem when the edit operations are translocations of adjacent factors of equal length and inversions of factors. In particular, we first present an $O(nm\max(\alpha,\beta))$-time and $O(m^2)$-space algorithm, where $\alpha$ and $\beta$ are respectively the maximum lengths of the factors which can be involved in any translocation and inversion, and show that under the assumptions of equiprobability and independence of characters our algorithm has an $O(n\log_\sigma m)$ average time complexity, for an alphabet of size $\sigma$. We also present a very fast variant of a recently proposed algorithm for the same problem, based on an efficient filtering method, which has an $O(n)$-time complexity in the average case, though in the worst case it retains the same $O(nm\max(\alpha,\beta))$-time complexity.
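
To make the problem statement concrete, the following is a minimal brute-force sketch of the matching relation; it is not the algorithm of the paper and makes no attempt to reach the bounds stated above. It assumes that an inversion means reversing a factor of length at most `beta`, that a translocation swaps two adjacent factors of equal length at most `alpha`, and that the operations act on non-overlapping factors of the pattern. The function names `matches_at` and `search` are hypothetical, chosen only for this illustration.

```python
def matches_at(p, w, alpha, beta):
    """Return True if w can be obtained from p by non-overlapping
    inversions (factor length <= beta) and translocations of adjacent
    equal-length factors (each of length <= alpha).

    Assumption: an inversion is a plain reversal of the factor.
    """
    m = len(p)
    if len(w) != m:
        return False
    # ok[i] is True when the prefix p[:i] can be turned into w[:i].
    ok = [False] * (m + 1)
    ok[0] = True
    for i in range(1, m + 1):
        # Case 1: the last character is left unchanged.
        if ok[i - 1] and p[i - 1] == w[i - 1]:
            ok[i] = True
            continue
        # Case 2: the last k characters form an inverted factor.
        for k in range(2, min(beta, i) + 1):
            if ok[i - k] and p[i - k:i][::-1] == w[i - k:i]:
                ok[i] = True
                break
        if ok[i]:
            continue
        # Case 3: the last 2k characters are two swapped factors of length k.
        for k in range(1, min(alpha, i // 2) + 1):
            if (ok[i - 2 * k]
                    and p[i - 2 * k:i - k] == w[i - k:i]
                    and p[i - k:i] == w[i - 2 * k:i - k]):
                ok[i] = True
                break
    return ok[m]


def search(p, t, alpha, beta):
    """Report every start position s such that p matches t[s:s+len(p)]."""
    m = len(p)
    return [s for s in range(len(t) - m + 1)
            if matches_at(p, t[s:s + m], alpha, beta)]
```

For instance, under these assumptions `matches_at("abcd", "cdab", 2, 0)` holds because the two adjacent factors `ab` and `cd` are swapped, while lowering `alpha` to 1 rules that match out. Each window is checked independently here, so the sketch is useful as a reference for testing faster implementations rather than as a practical searcher.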