Computational methods for molecular sequence data are at the heart of computational molecular biology. Major applications include identifying known or unknown DNA and RNA motifs or regions involved in biological processes such as initiation of transcription, gene expression and translation, and discovering various types of repeats. Accurate identification and localization of such elements will allow biologists to perform deeper studies of the structure, function and evolution of genomes. This requires the development of faster and more complex mathematical models and computer algorithms. In this work we discuss current techniques for string problems in molecular sequence data. We focus on weighted sequences and sequences with "don't care" characters, explaining the open problems and their relevance to biological applications.