A gapped pattern is a sequence consisting of regular alphabet symbols and of joker symbols that match any alphabet symbol. The content of a gapped pattern is defined as the number of its non-joker symbols. A gapped motif is a gapped pattern that occurs repeatedly in a string or in a set of strings. The aim of this paper is to study the complexity of several gapped-motif-finding problems. The following three decision problems are shown to be NP-complete, even if the input alphabet is binary. (i) Given a string T and two integers c and q, decide whether there exists a gapped pattern with content c (or more) that occurs in T at q distinct positions (or more). (ii) Given a set of strings S and an integer c, decide whether there exists a gapped pattern with content c that occurs at least once in each string of S. (iii) Given m strings with the same length, and two integers c and q, decide whether there exists a gapped pattern with content c that matches at least q input strings. We also present a non-naive quadratic-time algorithm that solves the following optimization problem: given a string T and an integer q ≥ 1, compute a maximum-content gapped pattern Q such that q consecutive copies of Q occur in T.
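To make the definitions concrete, the following Python sketch models a gapped pattern as a string in which `*` stands for the joker. That notation, along with every function name below, is an assumption made for illustration and does not come from the paper. The last function is a plain brute-force baseline for the optimization problem (q consecutive copies of Q in T); it is not the quadratic-time algorithm the abstract refers to.

```python
JOKER = "*"  # assumed joker notation; the abstract does not fix a symbol


def content(pattern: str) -> int:
    """Number of non-joker symbols in a gapped pattern."""
    return sum(1 for ch in pattern if ch != JOKER)


def matches_at(pattern: str, text: str, pos: int) -> bool:
    """True if the pattern occurs in text at position pos (jokers match anything)."""
    if pos < 0 or pos + len(pattern) > len(text):
        return False
    window = text[pos:pos + len(pattern)]
    return all(p == JOKER or p == t for p, t in zip(pattern, window))


def occurrences(pattern: str, text: str) -> list[int]:
    """All start positions at which the pattern occurs in text."""
    last_start = len(text) - len(pattern)
    return [i for i in range(last_start + 1) if matches_at(pattern, text, i)]


def best_tandem_pattern(text: str, q: int) -> str:
    """Brute-force baseline: a maximum-content pattern Q such that
    q consecutive copies of Q occur in text. Tries every period length
    and start position; this is NOT the paper's quadratic-time method."""
    n = len(text)
    best = ""
    for period in range(1, n // q + 1):
        for start in range(n - q * period + 1):
            symbols = []
            for j in range(period):
                # Symbols that the q copies place at offset j of the period.
                column = {text[start + k * period + j] for k in range(q)}
                # Keep the symbol only if all copies agree, else use a joker.
                symbols.append(text[start + j] if len(column) == 1 else JOKER)
            candidate = "".join(symbols)
            if content(candidate) > content(best):
                best = candidate
    return best
```

For example, `occurrences("a*b", "aabacb")` checks the windows `aab`, `aba`, `bac` and `acb`, and `best_tandem_pattern("abcabdabc", 2)` looks for two adjacent blocks that agree on as many positions as possible. The baseline rebuilds a candidate for every period and start position, so its running time grows well beyond quadratic on long inputs.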