Searching Genomes for Noncoding RNA Using FastR

  • Authors:
  • Shaojie Zhang; Brian Haas; Eleazar Eskin; Vineet Bafna

  • Venue:
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
  • Year:
  • 2005

Abstract

The discovery of novel noncoding RNAs (ncRNAs) has been among the most exciting recent developments in biology. It has been hypothesized that there is, in fact, an abundance of functional ncRNAs with various catalytic and regulatory functions. However, the inherent signal for ncRNAs is weaker than the signal for protein-coding genes, making them harder to identify. We consider the following problem: given an RNA sequence with a known secondary structure, efficiently detect all structural homologs in a genomic database by computing the sequence and structure similarity to the query. Our approach, FastR, is based on structural filters that eliminate a large portion of the database while retaining the true homologs, and it allows us to search a typical bacterial genome in minutes on a standard PC. The results are two orders of magnitude better than those of the currently available software for the problem. We applied FastR to the discovery of novel riboswitches, a class of RNA domains found in untranslated regions that are of interest because they regulate metabolite synthesis by directly binding metabolites. We searched all available eubacterial and archaeal genomes for riboswitches from the purine, lysine, thiamin, and riboflavin subfamilies. Our results point to a number of novel candidates for each of these subfamilies and include genomes that were not previously known to contain riboswitches.
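To make the idea of a structural filter concrete, the following is a minimal sketch, not the filter defined in the paper. It assumes a deliberately simple criterion: a window of the genome is kept only if it contains a k-mer followed, after a loop of at least `min_loop` bases, by its exact reverse complement (a potential stem). The function names `has_stem` and `candidate_windows`, the parameters `k`, `min_loop`, `window_len`, and `step`, and the restriction to Watson-Crick pairs (no G-U wobble pairs) are all assumptions made for illustration.

```python
# Hypothetical stem-based filter; names and parameters are illustrative only.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(kmer):
    """Reverse complement of an RNA k-mer over the alphabet A, C, G, U."""
    return "".join(COMPLEMENT[base] for base in reversed(kmer))


def has_stem(window, k, min_loop=3):
    """Return True if `window` contains a k-mer and, at least `min_loop`
    bases downstream of its end, the exact reverse complement of that k-mer."""
    upstream = set()  # k-mers that start far enough upstream to pair
    for j in range(len(window) - k + 1):
        i = j - k - min_loop  # latest allowed start of the 5' arm
        if i >= 0:
            upstream.add(window[i:i + k])
        if reverse_complement(window[j:j + k]) in upstream:
            return True
    return False


def candidate_windows(genome, window_len, step, k, min_loop=3):
    """Start offsets of the windows that pass the filter.
    Assumes the input contains only A, C, G, and T/U (T is read as U)."""
    seq = genome.upper().replace("T", "U")
    last_start = max(len(seq) - window_len, 0)
    return [
        start
        for start in range(0, last_start + 1, step)
        if has_stem(seq[start:start + window_len], k, min_loop)
    ]
```

In a pipeline of this kind, only the windows returned by `candidate_windows` would be passed on to a more expensive sequence and structure alignment against the query, which is where the savings from discarding most of the database would come from.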