On the complexity of finding common approximate substrings

  • Authors: Patricia A. Evans; Andrew D. Smith; H. Todd Wareham
  • Affiliations: Faculty of Computer Science, University of New Brunswick, P.O. Box 4400, Fredericton, NB, Canada, E3B 5A3; Faculty of Computer Science, University of New Brunswick, P.O. Box 4400, Fredericton, NB, Canada, E3B 5A3; Department of Computer Science, Memorial University of Newfoundland, St. John's, NF, Canada, A1B 3X5
  • Venue: Theoretical Computer Science
  • Year: 2003

Abstract

Problems associated with finding strings that are within a specified Hamming distance of a given set of strings occur in several disciplines. In this paper, we use techniques from parameterized complexity to assess non-polynomial-time algorithmic options and complexity for the COMMON APPROXIMATE SUBSTRING (CAS) problem. Our analyses indicate under which parameter restrictions useful algorithms are possible, and include both class membership and parameterized reductions to prove class hardness. To achieve fixed-parameter tractability, either a fixed string length or both a fixed-size alphabet and a fixed substring length are sufficient. Fixing either the string length or the alphabet size and Hamming distance is shown to be necessary, unless W[1] = FPT. An assortment of parameterized class membership and hardness results covers all other parameterized variants, showing in particular the effect of fixing the number of strings.
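To make the sufficiency claim for a fixed-size alphabet and fixed substring length concrete, the following is a minimal brute-force sketch, not the authors' algorithm. It assumes the usual formulation of CAS: given a set of strings over an alphabet, a substring length l, and a distance bound d, decide whether some string of length l lies within Hamming distance d of a length-l substring of every input string. The function names and parameters are hypothetical and chosen only for illustration. Enumerating all candidates costs a factor of |alphabet|^l, which depends only on the two fixed parameters, while the remaining work is polynomial in the input size.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def has_close_substring(s, center, d):
    """True if some substring of s with len(center) characters is within
    Hamming distance d of center."""
    l = len(center)
    return any(hamming(s[i:i + l], center) <= d
               for i in range(len(s) - l + 1))


def common_approximate_substring(strings, alphabet, l, d):
    """Return the first length-l string (in enumeration order) that is
    within distance d of a substring of every input string, or None.

    Assumption: this is the standard CAS formulation; the abstract
    itself does not spell out the definition.
    """
    for candidate in product(alphabet, repeat=l):
        center = "".join(candidate)
        if all(has_close_substring(s, center, d) for s in strings):
            return center
    return None
```

For example, calling `common_approximate_substring(["ACGT", "TCGA", "GGCG"], "ACGT", 2, 0)` searches the 16 candidate strings of length 2 and finds the exact common substring `"CG"`. The sketch illustrates only why bounding the alphabet size and substring length together keeps the search space bounded; it says nothing about the hardness results for the other parameter combinations.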