Algorithms for finding similar, or highly conserved, regions in a group of sequences are at the core of many molecular biology problems. Assume that we are given n DNA sequences s1, ..., sn. The Consensus Patterns problem, which has been widely studied in bioinformatics research, asks in its simplest form for a region of length L in each si and a median string s of length L such that the total Hamming distance from s to these regions is minimized. We show that the problem is NP-hard and give a polynomial time approximation scheme (PTAS) for it. We then present an efficient approximation algorithm for the Consensus Patterns problem under the original relative entropy measure. As an interesting application of our analysis, we further obtain a PTAS for a restricted (but still NP-hard) version of the important consensus alignment problem in which each sequence may contain at most a constant number of gaps, each of arbitrary length.
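To make the objective concrete, the following is a minimal Python sketch of the Hamming-distance version of the problem. It is not the PTAS described above; it is a simple baseline that tries every length-L substring of the input as a candidate median and, for each candidate, picks the closest region in each sequence. The function names (`hamming`, `best_window`, `total_cost`, `baseline_consensus`) are hypothetical and chosen for illustration, and the sketch assumes every sequence has length at least L.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_window(seq: str, median: str) -> Tuple[int, int]:
    """Return (distance, start) of the length-L region of seq closest to median.

    Assumes len(seq) >= len(median). Ties are broken by the smaller start.
    """
    L = len(median)
    return min(
        (hamming(seq[i:i + L], median), i)
        for i in range(len(seq) - L + 1)
    )


def total_cost(seqs: List[str], median: str) -> int:
    """Objective value: sum over sequences of the distance to the best region."""
    return sum(best_window(s, median)[0] for s in seqs)


def baseline_consensus(seqs: List[str], L: int) -> Tuple[int, str]:
    """Baseline (not the PTAS): try each length-L input substring as the median.

    Returns (cost, median) with the smallest total Hamming distance among
    these candidates; the true optimum may use a median that does not occur
    in any input sequence.
    """
    candidates = {s[i:i + L] for s in seqs for i in range(len(s) - L + 1)}
    return min((total_cost(seqs, c), c) for c in candidates)


if __name__ == "__main__":
    seqs = ["ACGTAC", "TTACGA", "GACGTT"]
    cost, median = baseline_consensus(seqs, 3)
    print(cost, median)
    for s in seqs:
        print(s, best_window(s, median))
```

For a fixed median, choosing the best region in each sequence independently is optimal, which is why `total_cost` can simply sum the per-sequence minima; the hard part of the problem is the choice of the median itself.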