Tree Decomposition Based Fast Search of RNA Structures Including Pseudoknots in Genomes
CSB '05 Proceedings of the 2005 IEEE Computational Systems Bioinformatics Conference
Searching Genomes for Noncoding RNA Using FastR
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
The 2-Interval Pattern Matching Problems and Its Application to ncRNA Scanning
BICoB '09 Proceedings of the 1st International Conference on Bioinformatics and Computational Biology
RNA Search with Decision Trees and Partial Covariance Models
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
A fast approximate covariance-model-based database search method for non-coding RNA
ISBRA'07 Proceedings of the 3rd international conference on Bioinformatics research and applications
Exploring the ncRNA-ncRNA patterns based on bridging rules
Journal of Biomedical Informatics
Hardware-Accelerated RNA Secondary-Structure Alignment
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
BSB'05 Proceedings of the 2005 Brazilian conference on Advances in Bioinformatics and Computational Biology
Structural alignment of pseudoknotted RNA
RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology
Designing Filters for Fast-Known NcRNA Identification
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Accelerating ncRNA homology search with FPGAs
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool for finding new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are slow. This paper shows how to make CMs faster while provably sacrificing none of their accuracy. Specifically, based on the CM, our software builds a profile hidden Markov model (HMM), which filters the genome database. This HMM is a rigorous filter, i.e., its filtering eliminates only sequences that provably could not be annotated as homologs. The CM is run only on what remains. Optimizing the HMM for filtering involves minimizing an exponential objective function with linear inequality constraints. For most known ncRNA families, this allows an 8-gigabase database to be scanned in 2-20 days instead of years, and yields new family members missed by other techniques for improving CM speed.
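The rigorous-filter idea can be illustrated with a minimal sketch. It is not the paper's software: the two scoring functions below are toy stand-ins invented for illustration (a pair-counting score in place of the CM, and a base-count bound in place of the profile HMM), and the names `expensive_score`, `cheap_upper_bound`, `rigorous_search`, and `exhaustive_search` are hypothetical. The only property carried over from the abstract is that the filter score never falls below the full score, so discarding a window on the filter score alone cannot lose a hit.

```python
from collections import Counter

# Watson-Crick pairs used by the toy "expensive" scorer (an assumption for
# illustration; a real CM scores far richer structure).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def expensive_score(window: str) -> int:
    """Toy stand-in for the CM: count complementary pairs (i, n-1-i)."""
    n = len(window)
    return sum((window[i], window[n - 1 - i]) in PAIRS for i in range(n // 2))


def cheap_upper_bound(window: str) -> int:
    """Toy stand-in for the HMM filter.

    Every A-U pair consumes one A and one U, and every G-C pair one G and
    one C, so this value is never smaller than expensive_score(window).
    """
    counts = Counter(window)
    return min(counts["A"], counts["U"]) + min(counts["G"], counts["C"])


def rigorous_search(genome: str, width: int, threshold: int):
    """Run the expensive scorer only on windows the filter cannot rule out."""
    hits = []
    scored = 0  # number of windows that reached the expensive scorer
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        if cheap_upper_bound(window) < threshold:
            continue  # provably below threshold, safe to discard
        scored += 1
        score = expensive_score(window)
        if score >= threshold:
            hits.append((start, score))
    return hits, scored


def exhaustive_search(genome: str, width: int, threshold: int):
    """Reference search without filtering, for comparison."""
    hits = []
    for start in range(len(genome) - width + 1):
        score = expensive_score(genome[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


genome = "AAAAAAAAGGGGAAAACCCCAAAAAAAA"
filtered_hits, scored = rigorous_search(genome, width=12, threshold=4)
print(filtered_hits == exhaustive_search(genome, width=12, threshold=4))
print(scored, "of", len(genome) - 12 + 1, "windows reached the expensive scorer")
```

In this sketch the speedup comes entirely from how many windows the bound rejects; the abstract's optimization step can be read as tuning the filter so that this rejection rate is as high as possible while the upper-bound property still holds.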