Faster genome annotation of non-coding RNA families without loss of accuracy

Authors:
Zasha Weinberg;Walter L. Ruzzo
Affiliations:
University of Washington, Seattle, WA;University of Washington, Seattle, WA
Venue:
RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology
Year:
2004

Abstract

Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool to find new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are slow. This paper shows how to make CMs faster while provably sacrificing none of their accuracy. Specifically, based on the CM, our software builds a profile hidden Markov model (HMM), which filters the genome database. This HMM is a rigorous filter, i.e., its filtering eliminates only sequences that provably could not be annotated as homologs. The CM is run only on what remains. Optimizing the HMM for filtering involves minimizing an exponential objective function with linear inequality constraints. For most known ncRNA families, this allows an 8-gigabase database to be scanned in 2-20 days instead of years, and yields new family members missed by other techniques for improving CM speed.
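
The filter-then-verify idea in the abstract can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the paper's software: `slow_score` is a hypothetical stand-in for an expensive structure-aware model, `fast_upper_bound` is a hypothetical stand-in for a cheap sequence-only filter, and `MOTIF`, `PAIR_BONUS`, and `scan` are invented for illustration. The only property carried over from the abstract is rigor: the cheap score never underestimates the expensive one, so discarding windows whose bound falls below the threshold cannot lose a hit.

```python
# Toy illustration only: these scorers are hypothetical stand-ins,
# not a covariance model or a profile HMM.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
MOTIF = "GGAC"   # hypothetical sequence motif expected at the window start
PAIR_BONUS = 2   # hypothetical reward per complementary end-to-end pair


def sequence_score(window):
    """Cheap, sequence-only term: number of matches against MOTIF."""
    return sum(1 for a, b in zip(window, MOTIF) if a == b)


def slow_score(window):
    """Stand-in for the expensive model: sequence term plus structure term."""
    half = len(window) // 2
    paired = sum(1 for i in range(half)
                 if (window[i], window[-1 - i]) in PAIRS)
    return sequence_score(window) + PAIR_BONUS * paired


def fast_upper_bound(window):
    """Stand-in for the filter: assumes every position pairs, so the
    result is always >= slow_score(window)."""
    return sequence_score(window) + PAIR_BONUS * (len(window) // 2)


def scan(genome, width, threshold, use_filter=True):
    """Return (hits, number of slow_score calls) over all windows."""
    hits = []
    slow_calls = 0
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        if use_filter and fast_upper_bound(window) < threshold:
            continue  # safe to skip: slow_score cannot reach the threshold
        slow_calls += 1
        score = slow_score(window)
        if score >= threshold:
            hits.append((start, score))
    return hits, slow_calls
```

Calling `scan` with and without `use_filter` on the same input should return the same hits, with fewer calls to `slow_score` in the filtered run; how much is saved depends on how tight the bound is, which is presumably why the abstract describes optimizing the filter.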