Local alignment of RNA sequences with arbitrary scoring schemes

  • Authors:
  • Rolf Backofen; Danny Hermelin; Gad M. Landau; Oren Weimann

  • Affiliations:
  • Institute of Computer Science, Albert-Ludwigs Universität Freiburg, Freiburg, Germany; Department of Computer Science, University of Haifa, Haifa, Israel; Department of Computer Science, University of Haifa, Haifa, Israel; Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA

  • Venue:
  • CPM'06 Proceedings of the 17th Annual Conference on Combinatorial Pattern Matching
  • Year:
  • 2006

Abstract

Local similarity is an important tool in the comparative analysis of biological sequences and is therefore well studied. In particular, the Smith-Waterman technique and its normalized version are two established metrics for measuring local similarity in strings. In RNA sequences, however, where one must consider not only the sequential but also the structural features of the inspected molecules, the concept of local similarity becomes more complicated. First, even in the global setting, computing sequence-structure alignments is more difficult than computing standard sequence alignments, because the information is two-dimensional. Second, one can view locality in two different ways, in the sequential or in the structural sense, leading to different problem formulations. In this paper we introduce two sequentially-local similarity metrics for comparing RNA sequences. These metrics combine the global RNA alignment metric of Shasha and Zhang [16] with the Smith-Waterman metric [17] and its normalized version [2] used in strings. We generalize the familiar alignment graph used in string comparison so that it also applies to RNA sequences, and then use this generalization to devise two algorithms for computing local similarity according to our two suggested metrics. Our algorithms run in $\mathcal{O}(m^2 n \lg n)$ and $\mathcal{O}(m^2 n \lg n+n^2m)$ time respectively, where m ≤ n are the lengths of the two given RNAs. Both algorithms work with an arbitrary scoring scheme.
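For orientation, the string-only Smith-Waterman metric that the abstract builds on can be sketched as below. This is a minimal illustration of the classic dynamic program for plain strings, not the RNA sequence-structure algorithms of the paper. The function name `smith_waterman` and the scoring parameters `match`, `mismatch`, and `gap` (with a linear gap penalty) are assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between strings a and b.

    Illustrative scoring (an assumption, not taken from the paper):
    +match for equal characters, mismatch for unequal ones, and a
    linear penalty of gap per inserted or deleted character.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Clamping at 0 is what makes the alignment local: a
            # negative-scoring prefix is dropped and the alignment restarts.
            curr[j] = max(0,
                          prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                          prev[j] + gap,       # delete a[i-1]
                          curr[j - 1] + gap)   # insert b[j-1]
            best = max(best, curr[j])
        prev = curr
    return best
```

Calling `smith_waterman("GGACGUCC", "UUACGUAA")` scores the best-matching pair of substrings rather than the full sequences. The sketch runs in O(mn) time and keeps only two rows of the table, so it uses O(n) space; it ignores base-pairing structure entirely.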