Variants of constrained longest common subsequence

  • Authors:
  • Paola Bonizzoni;Gianluca Della Vedova;Riccardo Dondi;Yuri Pirola

  • Affiliations:
  • DISCo, Universití degli Studi di Milano-Bicocca, Milano, Italy;Dipartimento di Statistica, Universití degli Studi di Milano-Bicocca, Milano, Italy;Dipartimento di Scienze dei Linguaggi, della Comunicazione e degli Studi Culturali, Universití degli Studi di Bergamo, Bergamo, Italy;DISCo, Universití degli Studi di Milano-Bicocca, Milano, Italy

  • Venue:
  • Information Processing Letters
  • Year:
  • 2010

Abstract

We consider a variant of the classical Longest Common Subsequence problem called Doubly-Constrained Longest Common Subsequence (DC-LCS). Given two strings $s_1$ and $s_2$ over an alphabet $\Sigma$, a set $C_s$ of strings, and a function $C_o: \Sigma \to \mathbb{N}$, the DC-LCS problem consists of finding a longest common subsequence $s$ of $s_1$ and $s_2$ such that $s$ is a supersequence of all the strings in $C_s$ and the number of occurrences in $s$ of each symbol $\sigma \in \Sigma$ is upper bounded by $C_o(\sigma)$. The DC-LCS problem provides a clear mathematical formulation of a sequence comparison problem in Computational Biology and generalizes two other constrained variants of the LCS problem previously introduced in the literature: the Constrained LCS and the Repetition-Free LCS. We present two results for the DC-LCS problem. First, we give a fixed-parameter algorithm, parameterized by the length of the solution, that is also applicable to the two more specialized problems. Second, we prove a parameterized hardness result for the Constrained LCS problem when the parameters are the number of constraint strings ($|C_s|$) and the size of the alphabet $\Sigma$. This hardness result also implies the parameterized hardness of the DC-LCS problem (with the same parameters) and its NP-hardness when the size of the alphabet is constant.
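
To make the problem definition concrete, the following is a minimal Python sketch of a feasibility check and an exhaustive search for DC-LCS. It is not the fixed-parameter algorithm of the paper: it simply enumerates subsequences of the first string from longest to shortest, which takes exponential time and is only usable on very short inputs. The function names (`is_subsequence`, `is_feasible`, `dc_lcs_bruteforce`) and the representation of the occurrence bound as a dictionary are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def is_subsequence(t, s):
    """Return True if t can be obtained from s by deleting symbols."""
    it = iter(s)
    # `ch in it` consumes the iterator, so matches are found left to right.
    return all(ch in it for ch in t)


def is_feasible(s, s1, s2, constraint_strings, max_occurrences):
    """Check the DC-LCS conditions for a candidate string s."""
    # s must be a common subsequence of s1 and s2.
    if not (is_subsequence(s, s1) and is_subsequence(s, s2)):
        return False
    # s must be a supersequence of every constraint string.
    if not all(is_subsequence(c, s) for c in constraint_strings):
        return False
    # Each symbol may occur at most max_occurrences[symbol] times.
    # Assumption: symbols missing from the dictionary are unbounded.
    counts = Counter(s)
    return all(counts[a] <= max_occurrences.get(a, len(s)) for a in counts)


def dc_lcs_bruteforce(s1, s2, constraint_strings, max_occurrences):
    """Exhaustive search (illustration only): longest feasible string or None."""
    for k in range(len(s1), -1, -1):
        for idx in combinations(range(len(s1)), k):
            cand = "".join(s1[i] for i in idx)
            if is_feasible(cand, s1, s2, constraint_strings, max_occurrences):
                return cand
    return None  # no string satisfies all constraints
```

Under these assumptions, the two specialized problems mentioned in the abstract correspond to particular inputs: Constrained LCS leaves every symbol unbounded (an empty dictionary here), while Repetition-Free LCS uses no constraint strings and bounds every symbol by 1, for example `dc_lcs_bruteforce("abab", "baba", [], {"a": 1, "b": 1})`.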