In a sequence, approximate patterns are exponential in number. In this paper, we present a new notion of basis for the patterns with don't cares occurring in a given text (sequence). The primitive patterns are of interest because there are fewer of them than under previously known definitions (in one case, their number is sub-linear in the size of the text), and they can be used to extract all the patterns of a text. We present an incremental algorithm that computes the primitive patterns occurring at least q times in a text of length n, given the N primitive patterns occurring at least q − 1 times, in time O(|Σ| N n² log² n log log n). In the particular case where q = 2, the time complexity is only O(|Σ| n² log² n log log n). We also give an algorithm that decides whether a given pattern is primitive in a given text.
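
To make the terminology concrete, here is a minimal sketch of what it means for a pattern with don't cares to occur at least q times in a text. It is a naive scan for illustration only, not the incremental algorithm of the paper, and it does not test primitivity. The function names `occurrences` and `meets_quorum`, and the choice of `.` as the don't-care symbol, are assumptions made for this example.

```python
DONT_CARE = "."  # assumed don't-care symbol; it matches any single character


def occurrences(pattern, text):
    """Return the start positions at which `pattern` matches `text`.

    A position matches when every solid (non don't-care) character of the
    pattern equals the corresponding character of the text.
    """
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(p == DONT_CARE or p == text[i + j] for j, p in enumerate(pattern))
    ]


def meets_quorum(pattern, text, q):
    """Return True if `pattern` occurs at least `q` times in `text`."""
    return len(occurrences(pattern, text)) >= q


if __name__ == "__main__":
    text = "abcaxcabd"
    print(occurrences("a.c", text))      # start positions of "a.c"
    print(meets_quorum("a.c", text, 2))  # quorum check for q = 2
```

The scan takes time proportional to the pattern length times the text length per pattern, so it only illustrates the definitions; enumerating every candidate pattern this way would run into the exponential blow-up that the notion of a basis is meant to avoid.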