A Seed-Based Method for Predicting Common Secondary Structures in Unaligned RNA Sequences

  • Authors:
  • Xiaoyong Fang;Zhigang Luo;Zhenghua Wang;Bo Yuan;Jinlong Shi

  • Affiliations:
  • National Laboratory for Parallel & Distributed Processing, National University of Defense Technology, 410073 Changsha, China;National Laboratory for Parallel & Distributed Processing, National University of Defense Technology, 410073 Changsha, China;National Laboratory for Parallel & Distributed Processing, National University of Defense Technology, 410073 Changsha, China;Department of Biomedical Informatics, College of Medicine and Public Health, Ohio State University, 43210-1239 Columbus Ohio, USA;National Laboratory for Parallel & Distributed Processing, National University of Defense Technology, 410073 Changsha, China

  • Venue:
  • MDAI '07 Proceedings of the 4th international conference on Modeling Decisions for Artificial Intelligence
  • Year:
  • 2007

Abstract

The prediction of RNA secondary structure can be facilitated by incorporating comparative analysis of homologous sequences. However, most existing comparative approaches are vulnerable to alignment errors. Here we use unaligned sequences to devise a seed-based method for predicting RNA secondary structures. The central idea of our method can be described in three major steps: 1) detect all possible stems in each sequence using the so-called position matrix, which indicates whether each position in the sequence is paired or unpaired; 2) select the seeds for RNA folding by finding and assessing the stems conserved across all sequences; 3) predict RNA secondary structures on the basis of the seeds. We tested our method on data sets composed of RNA sequences with known secondary structures. Our method achieves an average accuracy (measured as sensitivity) of 69.93% for single-sequence tests, 72.97% for two-sequence tests, and 79.27% for three-sequence tests. The results show that our method can predict RNA secondary structure with higher accuracy than Mfold.
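
To make step 1 and the accuracy measure more concrete, the following is a minimal Python sketch, not the authors' implementation. It enumerates maximal candidate stems in a single sequence by brute force rather than with the position matrix named in the abstract, and it computes sensitivity as the fraction of known base pairs that are also predicted, which is the usual definition but is not spelled out here. The function names (`find_stems`, `stem_pairs`, `sensitivity`), the thresholds `min_len` and `min_loop`, and the choice to allow G-U wobble pairs are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Assumption: canonical Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

Stem = Tuple[int, int, int]  # (i, j, length): i+k pairs with j-k for k < length


def find_stems(seq: str, min_len: int = 3, min_loop: int = 3) -> List[Stem]:
    """Enumerate maximal stems by brute force (hypothetical helper)."""
    n = len(seq)
    stems: List[Stem] = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that merely continue a stem starting further out.
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Extend inward while bases pair and the loop keeps min_loop bases.
            while ((j - length) - (i + length) > min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems


def stem_pairs(stem: Stem) -> Set[Tuple[int, int]]:
    """Expand a stem into its set of base-pair index tuples."""
    i, j, length = stem
    return {(i + k, j - k) for k in range(length)}


def sensitivity(predicted: Set[Tuple[int, int]],
                reference: Set[Tuple[int, int]]) -> float:
    """Fraction of reference base pairs that appear in the prediction."""
    if not reference:
        return 0.0
    return len(predicted & reference) / len(reference)


if __name__ == "__main__":
    stems = find_stems("GGGAAAACCC")
    print(stems)
    print(sensitivity(stem_pairs(stems[0]), {(0, 9), (1, 8), (2, 7)}))
```

In this toy hairpin the only stem of length three or more pairs positions 0 to 2 with positions 9 to 7. Steps 2 and 3 (selecting conserved stems as seeds and folding around them) are not sketched, because the abstract does not describe how stems are assessed.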