More efficient algorithms for closest string and substring problems

Authors:
Bin Ma;Xiaoming Sun
Affiliations:
Department of Computer Science, University of Western Ontario, London, ON, Canada;Center for Advanced Study and Institute for Theoretical Computer Science, Tsinghua University, Beijing, China
Venue:
RECOMB'08 Proceedings of the 12th annual international conference on Research in computational molecular biology
Year:
2008

Abstract

The closest string and closest substring problems have applications in PCR primer design, genetic probe design, motif finding, and antisense drug design. Because of their importance, the two problems have been studied extensively in computational biology. Unfortunately, both problems are NP-complete. Researchers have developed both fixed-parameter algorithms and approximation algorithms for the two problems. On the fixed-parameter side, when the radius $d$ is the parameter, the best-known fixed-parameter algorithm for closest string has time complexity $O(nd^{d+1})$, which is still superpolynomial even if $d = O(\log n)$. In this paper we provide an $O(n|\Sigma|^{O(d)})$ algorithm, where $\Sigma$ is the alphabet. This gives a polynomial-time algorithm when $d = O(\log n)$ and $\Sigma$ has constant size. Using the same technique, we additionally provide a more efficient subexponential-time algorithm for the closest substring problem. On the approximation side, both the closest string and closest substring problems admit polynomial-time approximation schemes (PTAS). The best known time complexity of the PTAS is $O(n^{O(\epsilon^{-2} \log 1/\epsilon)})$. In this paper we present a PTAS with time complexity $O(n^{O(\epsilon^{-2})})$. Finally, we prove that a restricted version of closest substring has the same parameterized complexity as closest substring, answering an open question in the literature.
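For readers unfamiliar with the problem, the following minimal sketch illustrates what a closest string instance asks for: given equal-length strings, find a center string whose maximum Hamming distance to the inputs (the radius d) is as small as possible. It is a naive exhaustive search over all candidate strings, included only to make the definition concrete. It is not the algorithm of the paper, and the function names (`hamming`, `radius`, `closest_string_bruteforce`) are hypothetical names chosen for this illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def radius(center, strings):
    """Largest Hamming distance from the center to any input string."""
    return max(hamming(center, s) for s in strings)


def closest_string_bruteforce(strings, alphabet):
    """Try every string over the alphabet and keep the one of smallest radius.

    Illustration only: this enumerates |alphabet|**L candidates for strings
    of length L, so it is usable only on toy inputs.
    """
    length = len(strings[0])
    best, best_radius = None, length + 1
    for candidate in product(alphabet, repeat=length):
        center = "".join(candidate)
        r = radius(center, strings)
        if r < best_radius:
            best, best_radius = center, r
    return best, best_radius


if __name__ == "__main__":
    print(closest_string_bruteforce(["ACGT", "ACGA", "TCGT"], "ACGT"))
```

On this toy input the search returns a center of radius 1. The exhaustive enumeration is exponential in the string length, which is why parameterizing by the radius d, as the abstract describes, is of interest.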