We study two pattern matching problems that are motivated by applications in computational biology. In the Closest Substring problem, $k$ strings $s_1,\dots, s_k$ are given, and the task is to find a string $s$ of length $L$ such that each string $s_i$ has a consecutive substring of length $L$ whose distance is at most $d$ from $s$. We present two algorithms that aim to be efficient for small fixed values of $d$ and $k$: for some functions $f$ and $g$, the algorithms have running time $f(d)\cdot n^{O(\log d)}$ and $g(d,k)\cdot n^{O(\log\log k)}$, respectively. The second algorithm is based on connections with the extremal combinatorics of hypergraphs. The Closest Substring problem is also investigated from the parameterized complexity point of view. Answering an open question from [P. A. Evans, A. D. Smith, and H. T. Wareham, Theoret. Comput. Sci., 306 (2003), pp. 407-430; M. R. Fellows, J. Gramm, and R. Niedermeier, Combinatorica, 26 (2006), pp. 141-167; J. Gramm, J. Guo, and R. Niedermeier, Lecture Notes in Comput. Sci. 2751, Springer, Berlin, 2003, pp. 195-209; J. Gramm, R. Niedermeier, and P. Rossmanith, Algorithmica, 37 (2003), pp. 25-42], we show that the problem is W[1]-hard even if both $d$ and $k$ are parameters. As a consequence of this hardness result, our algorithms are optimal in the sense that the exponent of $n$ in the running time cannot be improved to $o(\log d)$ or to $o(\log \log k)$ (modulo some complexity-theoretic assumptions). Consensus Patterns is the variant of the problem where, instead of requiring that each $s_i$ has a substring of distance at most $d$ from $s$, we have to select the substrings in such a way that the average of these $k$ distances is at most $\delta$. By giving an $f(\delta)\cdot n^9$ time algorithm, we show that the problem is fixed-parameter tractable. This answers an open question from [M. R. Fellows, J. Gramm, and R. Niedermeier, Combinatorica, 26 (2006), pp. 141-167].
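To make the two problem definitions concrete, the following is a minimal sketch of a brute-force verifier for a given candidate string $s$. It is not one of the algorithms from the paper; it only checks whether a candidate satisfies the Closest Substring or Consensus Patterns condition. It assumes that "distance" means Hamming distance (the abstract does not name the metric), and all function names are hypothetical.

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ.

    Assumption: the abstract's "distance" is taken to be Hamming distance.
    """
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_window_distance(s: str, text: str) -> int:
    """Smallest distance between s and any consecutive substring of text
    that has the same length L = len(s)."""
    L = len(s)
    if len(text) < L:
        raise ValueError("text is shorter than the candidate string")
    return min(hamming(s, text[i:i + L]) for i in range(len(text) - L + 1))


def is_closest_substring(s: str, strings: Sequence[str], d: int) -> bool:
    """Closest Substring condition: every s_i has a length-L substring
    within distance d of s."""
    return all(best_window_distance(s, t) <= d for t in strings)


def is_consensus_pattern(s: str, strings: Sequence[str], delta: float) -> bool:
    """Consensus Patterns condition: the average of the k selected distances
    is at most delta. For a fixed s, picking the closest window in each
    string minimizes that average, so checking the minimum suffices."""
    total = sum(best_window_distance(s, t) for t in strings)
    return total <= delta * len(strings)


# Illustrative input (made up for this sketch).
example_strings = ["xxACGTxx", "ACCTyyyy", "zzzzAGGT"]
candidate = "ACGT"
print(is_closest_substring(candidate, example_strings, d=1))
print(is_consensus_pattern(candidate, example_strings, delta=0.7))
```

Verifying one candidate takes time polynomial in the input size. The difficulty addressed in the abstract lies in finding $s$: the naive search ranges over all $|\Sigma|^L$ strings of length $L$, which is what the $f(d)\cdot n^{O(\log d)}$ and $g(d,k)\cdot n^{O(\log\log k)}$ algorithms avoid.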