Several methods have been developed for identifying RNA structures of varying complexity in a genome. Whatever the method, it always relies on searching for conserved primary and secondary structures. While various efficient methods have been developed for searching for primary-structure motifs, usually represented as regular expressions, little effort has been devoted to the efficient search for secondary-structure signals. By a helix, we mean a structure defined by a combination of sequence and folding constraints. We present a flexible algorithm that searches for all approximate matches of a helix in a genome. Helices are represented by special regular expressions, which we call secondary expressions. The method is based on an alignment graph constructed from several copies of a pushdown automaton, arranged one on top of another. The worst-case time complexity is O(rpn), where n is the size of the genome, p is the size of the secondary expression, and r is its number of union symbols. We present the results of searching for specific signals of the tRNA and RNase P RNA in two genomes.
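To make the notion of an approximate helix match concrete, the following is a minimal brute-force sketch in Python. It is not the alignment-graph method described above: it does not build a pushdown automaton, it counts only mismatched base pairs (no insertions or deletions), and it takes fixed arm and loop lengths instead of a secondary expression. The function name `find_helices`, its parameters, and the decision to accept G-U wobble pairs are all assumptions made for illustration.

```python
# Base pairs treated as complementary in this sketch: Watson-Crick pairs
# plus the G-U wobble pair (an assumption, not taken from the abstract).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_helices(genome, stem_len, min_loop, max_loop, max_mismatches):
    """Return (start, loop_len, mismatches) for each approximate helix.

    A helix is modelled as a 5' arm of stem_len bases, a loop of
    min_loop..max_loop bases, and a 3' arm of stem_len bases that pairs
    with the 5' arm in antiparallel order.
    """
    hits = []
    n = len(genome)
    for start in range(n):
        for loop_len in range(min_loop, max_loop + 1):
            end = start + 2 * stem_len + loop_len  # exclusive end of the helix
            if end > n:
                break  # longer loops cannot fit either
            mismatches = 0
            for k in range(stem_len):
                five = genome[start + k]      # k-th base of the 5' arm
                three = genome[end - 1 - k]   # its partner on the 3' arm
                if (five, three) not in PAIRS:
                    mismatches += 1
                    if mismatches > max_mismatches:
                        break
            if mismatches <= max_mismatches:
                hits.append((start, loop_len, mismatches))
    return hits
```

For a toy sequence such as `"GGGAAAACCC"`, calling `find_helices(seq, 3, 4, 4, 0)` reports the single hairpin starting at position 0. This naive scan costs time proportional to the genome length times the number of loop lengths times the stem length, and it serves only to illustrate the matching problem, not the O(rpn) bound stated in the abstract.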