Given a set S of k strings of maximum length n, the goal of the closest substring problem (CSSP) is to find the smallest integer d (and a corresponding string t of length ℓ≤n) such that each string s∈S has a substring of length ℓ within distance d of t. The closest string problem (CSP) is the special case of CSSP in which ℓ=n. Both problems arise in many bioinformatics applications and have been studied extensively under the Hamming and edit distances. In this paper we consider a recently introduced distance measure, the rank distance. First, we show that CSP and CSSP under rank distance are NP-hard. Then, we present a polynomial-time k-approximation algorithm for CSP. Finally, we give a parameterized algorithm for CSP (the parameter is the number of input strings) for the case where the alphabet is binary and each string has the same number of 0's and 1's.
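
To make the CSSP objective concrete, the following minimal sketch evaluates a candidate string t against a set of strings and searches all candidates exhaustively. It is an illustration only, not the approximation or parameterized algorithm of the paper. The function names (hamming, rank_distance, radius, closest_substring_bruteforce) are hypothetical, every input string is assumed to have length at least ℓ, and the rank distance shown is our understanding of the definition commonly given in the literature (annotate each character with its occurrence number, sum the absolute differences of the 1-based positions of shared annotated characters, and add the position of each unshared one); the abstract itself does not state that definition.

```python
from collections import defaultdict
from itertools import product


def hamming(u, v):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(u, v))


def rank_distance(u, v):
    """Rank distance between two strings (assumed definition, see lead-in)."""
    def annotate(s):
        # Map each (character, occurrence number) pair to its 1-based position.
        seen = defaultdict(int)
        positions = {}
        for i, ch in enumerate(s, start=1):
            seen[ch] += 1
            positions[(ch, seen[ch])] = i
        return positions

    pu, pv = annotate(u), annotate(v)
    # A missing annotated character contributes its full position (rank 0 on the other side).
    return sum(abs(pu.get(x, 0) - pv.get(x, 0)) for x in pu.keys() | pv.keys())


def radius(t, strings, dist=hamming):
    """Smallest d such that every string has a length-len(t) substring within d of t.

    Assumes every string has length at least len(t).
    """
    l = len(t)
    return max(
        min(dist(t, s[i:i + l]) for i in range(len(s) - l + 1))
        for s in strings
    )


def closest_substring_bruteforce(strings, l, alphabet, dist=hamming):
    """Exhaustive search over all |alphabet|**l candidates; exponential in l."""
    candidates = ("".join(p) for p in product(alphabet, repeat=l))
    return min((radius(t, strings, dist), t) for t in candidates)
```

For example, closest_substring_bruteforce(["aab", "bab", "abb"], 2, "ab") returns (0, "ab"), since "ab" occurs in all three strings. Setting l to the common string length gives the CSP special case, and passing dist=rank_distance switches the measure. The exponential cost of the search is consistent with the NP-hardness results, which is why approximation and parameterized algorithms are of interest.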