This paper presents a collection of string algorithms that are at the core of several biological problems, such as discovering potential drug targets and creating diagnostic probes, universal primers, or unbiased consensus sequences. All these problems reduce to the task of finding a pattern that, with some error, occurs in one set of strings (Closest Substring Problem) and does not occur in another set (Farthest String Problem). In this paper, we break down the problem into several subproblems and prove the following results.
1. The following are all NP-hard: the Farthest String Problem, the Closest Substring Problem, and the Closest String Problem, which asks for a string that is close to each string in a set.
2. There is a PTAS for the Farthest String Problem based on a linear programming relaxation technique.
3. There is a polynomial-time (4/3 + ε)-approximation algorithm for the Closest String Problem for any small constant ε > 0. Using this algorithm, we also provide an efficient heuristic algorithm for the Closest Substring Problem.
4. The problem of finding a string that is at least Hamming distance d from as many strings in a set as possible cannot be approximated within n^ε in polynomial time for some fixed constant ε unless NP = P, where n is the number of strings in the set.
5. There is a polynomial-time 2-approximation for finding a string that is both the Closest Substring to one set and the Farthest String from another set.
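To make the two core objectives concrete, the following is a minimal sketch of the Closest String and Farthest String problems under Hamming distance. It is not the paper's PTAS or approximation algorithm: it enumerates every candidate string by brute force, which takes exponential time and is only usable on tiny instances. The function names (`hamming`, `radius`, `min_distance`, `closest_string_bruteforce`, `farthest_string_bruteforce`) are hypothetical and chosen for illustration, and all input strings are assumed to have equal length.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def radius(candidate: str, strings: list[str]) -> int:
    """Closest String objective: largest distance from candidate to any input."""
    return max(hamming(candidate, s) for s in strings)


def min_distance(candidate: str, strings: list[str]) -> int:
    """Farthest String objective: smallest distance from candidate to any input."""
    return min(hamming(candidate, s) for s in strings)


def _all_candidates(length: int, alphabet: str):
    # Enumerates |alphabet| ** length strings; illustration only.
    return ("".join(p) for p in product(alphabet, repeat=length))


def closest_string_bruteforce(strings: list[str], alphabet: str) -> str:
    """Return a string minimizing the maximum Hamming distance to the inputs."""
    candidates = _all_candidates(len(strings[0]), alphabet)
    return min(candidates, key=lambda c: radius(c, strings))


def farthest_string_bruteforce(strings: list[str], alphabet: str) -> str:
    """Return a string maximizing the minimum Hamming distance to the inputs."""
    candidates = _all_candidates(len(strings[0]), alphabet)
    return max(candidates, key=lambda c: min_distance(c, strings))


if __name__ == "__main__":
    inputs = ["AAB", "ABA", "BAA"]
    center = closest_string_bruteforce(inputs, "AB")
    far = farthest_string_bruteforce(inputs, "AB")
    print(center, radius(center, inputs))
    print(far, min_distance(far, inputs))
```

Since both problems are NP-hard (result 1), exhaustive search of this kind is useful mainly as a reference for checking the output of an approximation algorithm or heuristic on small inputs.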