Configurations and minority in the string consensus problem

Authors:
Amihood Amir; Haim Paryenty; Liam Roditty
Affiliations:
Department of Computer Science, Bar Ilan University, Ramat Gan, Israel, and Department of Computer Science, Johns Hopkins University, Baltimore, MD; Department of Computer Science, Bar Ilan University, Ramat Gan, Israel; Department of Computer Science, Bar Ilan University, Ramat Gan, Israel
Venue:
SPIRE'12 Proceedings of the 19th international conference on String Processing and Information Retrieval
Year:
2012

References (13):

On the closest string and substring problems. Journal of the ACM (JACM)
A Linear-Time Algorithm for the 1-Mismatch Problem. WADS '97 Proceedings of the 5th International Workshop on Algorithms and Data Structures
Banishing Bias from Consensus Sequences. CPM '97 Proceedings of the 8th Annual Symposium on Combinatorial Pattern Matching
Distinguishing string selection problems. Information and Computation
On the Optimality of the Dimensionality Reduction Method. FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
Optimal Solutions for the Closest-String Problem via Integer Programming. INFORMS Journal on Computing
On the Structure of Small Motif Recognition Instances. SPIRE '08 Proceedings of the 15th International Symposium on String Processing and Information Retrieval
Exact Solutions for Closest String and Related Problems. ISAAC '01 Proceedings of the 12th International Symposium on Algorithms and Computation
Consensus Optimizing Both Distance Sum and Radius. SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
Swiftly computing center strings. WABI'10 Proceedings of the 10th international conference on Algorithms in bioinformatics
More Efficient Algorithms for Closest String and Substring Problems. SIAM Journal on Computing
Why large CLOSEST STRING instances are easy to solve in practice. SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval
Approximations and partial solutions for the consensus sequence problem. SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval

Abstract

The Closest String Problem is defined as follows. Given a set $S$ of $k$ strings $\{s_1,\ldots,s_k\}$, each of length $\ell$, find a string $\hat{s}$ such that the maximum Hamming distance of $\hat{s}$ from the strings of $S$ is minimized. We denote this distance by $d$. The string $\hat{s}$ is called a consensus string. In this paper we present two main algorithms for this problem: the Configuration algorithm, with $O(k^2 \ell^k)$ running time, and the Minority algorithm. The problem was introduced by Lanctot, Li, Ma, Wang and Zhang [13]. They showed that the problem is $\mathcal{NP}$-hard and provided an IP approximation algorithm. Since then the closest string problem has been studied extensively. This research can be roughly divided into three categories: approximate, exact and practical solutions. This paper falls under the exact solutions category. Despite the great effort to obtain efficient algorithms for this problem, an algorithm with the natural running time of $O(\ell^k)$ was not known. In this paper we close this gap. Our result means that algorithms solving the closest string problem in times $O(\ell^2)$, $O(\ell^3)$, $O(\ell^4)$ and $O(\ell^5)$ exist for the cases of $k=2,3,4$ and $5$, respectively. It is known that, in fact, the cases of $k=2,3$ and $4$ can be solved in linear time. No efficient algorithm is currently known for the case of $k=5$. We prove the minority lemma, which exploits surprising properties of the closest string problem and enables constructing the closest string in a sequential fashion. This lemma, together with some additional ideas, gives an $O(\ell^2)$ time algorithm for computing a closest string of 5 binary strings.
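
To make the problem definition concrete, the following is a minimal Python sketch of a naive exhaustive search for a consensus string. It is not the Configuration algorithm or the Minority algorithm from the paper, whose details the abstract does not give; it is only an exponential-time baseline that illustrates the objective being minimized. The function names (`hamming`, `radius`, `closest_string_bruteforce`) are hypothetical. The search tries, at each position, only the characters that occur in that column of the input, which is safe because swapping in a character that appears in no input string at that position can never reduce any distance.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def radius(candidate, strings):
    """Maximum Hamming distance from candidate to any input string."""
    return max(hamming(candidate, s) for s in strings)


def closest_string_bruteforce(strings):
    """Return (consensus, d) by exhaustive search (illustrative baseline only).

    Assumption: all strings have the same length. The running time is
    exponential in the string length, so this is usable only on tiny inputs.
    """
    if not strings:
        raise ValueError("need at least one string")
    if len({len(s) for s in strings}) != 1:
        raise ValueError("strings must have equal length")

    # Characters worth trying at each position: those seen in that column.
    columns = [sorted(set(col)) for col in zip(*strings)]

    best, best_d = None, None
    for chars in product(*columns):
        candidate = "".join(chars)
        d = radius(candidate, strings)
        if best_d is None or d < best_d:
            best, best_d = candidate, d
    return best, best_d


if __name__ == "__main__":
    consensus, d = closest_string_bruteforce(["000", "011", "101"])
    print(consensus, d)
```

For the three binary strings in the usage example, the candidate `001` is at Hamming distance 1 from each input, and no string is at distance 0 from all three, so the optimal value is $d = 1$.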