The discovery of novel non-coding RNAs has been among the most exciting recent developments in biology, yet many more remain undiscovered. It has been hypothesized that functional non-coding RNA (ncRNA), with various catalytic and regulatory functions, is in fact abundant. Computational methods tailored specifically to ncRNA are being actively developed. Because the inherent signal for ncRNA is weaker than that for protein-coding genes, comparative methods offer the most promising approach and are the subject of our research. We consider the following problem: given an RNA sequence with a known secondary structure, efficiently find all structural homologs in a genomic database, where homology is computed as a function of sequence and structural similarity. Our approach is based on structural filters that eliminate a large portion of the database while retaining the true homologs. It allows us to search a typical bacterial database in minutes on a standard PC, with high sensitivity and specificity. This is two orders of magnitude better than currently available software for the problem.
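To make the filtering idea concrete, here is a minimal sketch of what a structural filter could look like. It is an illustration under stated assumptions, not the filter used in this work: the function names (`has_stem`, `filter_windows`), the single-hairpin criterion, the RNA alphabet (A, C, G, U), and all parameter defaults are hypothetical. A window of the database is kept only if it contains two perfectly Watson-Crick complementary arms separated by a loop of plausible length; only the surviving windows would then be passed to a more expensive sequence-structure alignment.

```python
# Hypothetical sketch of a structural pre-filter; not the authors' method.
# Assumes an RNA alphabet (A, C, G, U) and exact Watson-Crick pairing only.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_stem(window, stem_len, min_loop, max_loop):
    """True if the window holds two complementary arms of length stem_len
    separated by a loop of min_loop..max_loop bases."""
    n = len(window)
    for i in range(n - 2 * stem_len - min_loop + 1):
        left = window[i:i + stem_len]
        if any(base not in COMPLEMENT for base in left):
            continue  # skip arms containing ambiguous symbols such as N
        target = reverse_complement(left)
        first = i + stem_len + min_loop
        last = min(i + stem_len + max_loop, n - stem_len)
        for j in range(first, last + 1):
            if window[j:j + stem_len] == target:
                return True
    return False


def filter_windows(genome, window_len, step, stem_len=4, min_loop=3, max_loop=8):
    """Return start offsets of the windows that pass the stem filter."""
    hits = []
    for start in range(0, len(genome) - window_len + 1, step):
        window = genome[start:start + window_len]
        if has_stem(window, stem_len, min_loop, max_loop):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Toy database: one hairpin (GGGC ... GCCC) embedded in poly-A.
    genome = "A" * 20 + "GGGCAAAAGCCC" + "A" * 20
    print(filter_windows(genome, window_len=20, step=10))
```

In this toy setting most windows are discarded after a cheap scan, which is the property that makes a filter worthwhile. A realistic filter would also need to tolerate mismatches and G-U wobble pairs so that true homologs are not lost.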