Characterizing RNA Secondary-Structure Features and Their Effects on Splice-Site Prediction

  • Authors:
  • Rezarta Islamaj Dogan; Lise Getoor; W. John Wilbur

  • Venue:
  • ICDMW '07: Proceedings of the Seventh IEEE International Conference on Data Mining Workshops
  • Year:
  • 2007

Abstract

[...] composition and by their three-dimensional shape, called the secondary structure. The secondary structure of a pre-mRNA sequence may have a strong influence on gene splicing. In our previous work, we showed that a splice-site model employing sequence features built using our feature generation algorithm was very effective in predicting splice sites. The generated sequence features also contained biologically relevant features. In this paper, we extend the feature generation algorithm to construct secondary-structure features. These features capture the nucleotide pairing tendency in the splice-site neighborhood. We extend the splice-site model to include both pre-mRNA sequence and structure characteristics. The new model significantly outperforms the sequence-based features model. The identified secondary-structure features capture biologically relevant signals such as splicing silencers. We also found these signals to prefer specific regions around the splice-site neighborhood and we detail their preference.
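
As a rough illustration of what "nucleotide pairing tendency in the splice-site neighborhood" could look like as model input, the Python sketch below turns a predicted structure into per-position pairing indicators around a candidate splice site. This is not the authors' feature generation algorithm, which the abstract does not describe. The dot-bracket input format, the function names `paired_mask` and `window_pairing_features`, the `site` and `flank` parameters, and the feature names are all assumptions made for illustration.

```python
from typing import Dict, List


def paired_mask(dot_bracket: str) -> List[int]:
    """Return 1 for each base-paired position and 0 for each unpaired one.

    Assumes a pseudoknot-free dot-bracket string using only '(', ')' and '.'.
    """
    stack: List[int] = []
    mask = [0] * len(dot_bracket)
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at index {i}")
            j = stack.pop()
            mask[i] = mask[j] = 1  # both partners of the pair are marked
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at index {i}")
    if stack:
        raise ValueError(f"unbalanced '(' at index {stack[-1]}")
    return mask


def window_pairing_features(dot_bracket: str, site: int, flank: int) -> Dict[str, float]:
    """Pairing indicators for positions site-flank .. site+flank-1.

    Each feature is keyed by its offset from the (hypothetical) splice-site
    index `site`; `paired_fraction` summarizes the whole window.
    """
    if flank <= 0:
        raise ValueError("flank must be positive")
    if site - flank < 0 or site + flank > len(dot_bracket):
        raise ValueError("window extends past the sequence ends")
    window = paired_mask(dot_bracket)[site - flank: site + flank]
    features = {
        f"paired@{offset:+d}": float(bit)
        for offset, bit in zip(range(-flank, flank), window)
    }
    features["paired_fraction"] = sum(window) / len(window)
    return features


if __name__ == "__main__":
    # Toy hairpin structure; a real structure would come from a folding tool.
    structure = "..((((...))))..."
    print(window_pairing_features(structure, site=6, flank=3))
```

Position-specific indicators of this kind can be appended to sequence features for the same window, so a classifier sees both which nucleotides occur and whether they tend to be paired. How the paper actually combines the two feature types is not stated in the abstract.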