Computing similarity between RNA structures

  • Authors:
  • Bin Ma; Lusheng Wang; Kaizhong Zhang

  • Affiliations:
  • Peking Univ., Beijing, People's Republic of China; City Univ. of Hong Kong, Kowloon; Univ. of Western Ontario, London, Ont., Canada

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2002

Abstract

The primary structure of a ribonucleic acid (RNA) molecule is a sequence of nucleotides (bases) over the four-letter alphabet {A, C, G, U}. The secondary or tertiary structure of an RNA is a set of base pairs (nucleotide pairs) that form bonds between A and U and between C and G. For secondary structures, these bonds have traditionally been assumed to be one-to-one and non-crossing. This paper considers a notion of similarity between two RNA structures that takes into account the primary, secondary, and tertiary structures. We show that, for tertiary structures, computing this similarity is MAX SNP-hard in both its minimization and maximization versions. For the maximization version we show a stronger result: it cannot be approximated within ratio 2^(log^δ n) in polynomial time unless NP ⊆ DTIME[2^(polylog n)]. We then present an algorithm that can be used in practical applications; it produces an optimal solution when at least one of the RNAs involved has a secondary structure. We also give an approximation algorithm.
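
To make the structural definitions in the abstract concrete, the following is a minimal sketch of how one might represent an RNA as a base sequence plus a set of base pairs and test whether the pairing is one-to-one and non-crossing (that is, a secondary structure in the abstract's sense). It is not the paper's algorithm and does not compute the similarity measure. The function names, the 0-based position indexing, and the two example molecules are assumptions made only for illustration.

```python
from typing import Iterable, List, Tuple

# A base pair is a tuple (i, j) of 0-based positions (an assumption of this sketch).
Pair = Tuple[int, int]

# Bonds allowed by the abstract's definition: A-U and C-G, in either order.
ALLOWED = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def normalize(pairs: Iterable[Pair]) -> List[Pair]:
    """Return the pairs with i <= j in each pair, sorted by left endpoint."""
    return sorted((min(p), max(p)) for p in pairs)


def pairs_are_valid(seq: str, pairs: Iterable[Pair]) -> bool:
    """True if every pair joins two distinct in-range positions holding A-U or C-G."""
    n = len(seq)
    return all(
        0 <= i < j < n and (seq[i], seq[j]) in ALLOWED
        for i, j in normalize(pairs)
    )


def is_one_to_one(pairs: Iterable[Pair]) -> bool:
    """True if no position takes part in more than one pair."""
    used = set()
    for i, j in pairs:
        if i in used or j in used:
            return False
        used.update((i, j))
    return True


def is_non_crossing(pairs: Iterable[Pair]) -> bool:
    """True if no two pairs (i, j), (k, l) satisfy i < k < j < l."""
    ps = normalize(pairs)
    for a, (i, j) in enumerate(ps):
        for k, l in ps[a + 1:]:
            if i < k < j < l:
                return False
    return True


def is_secondary_structure(seq: str, pairs: Iterable[Pair]) -> bool:
    """True if the pairs are valid bonds that are one-to-one and non-crossing."""
    ps = list(pairs)
    return pairs_are_valid(seq, ps) and is_one_to_one(ps) and is_non_crossing(ps)


# Hypothetical example: three nested G-C pairs form a secondary structure.
nested_seq = "GGGAAACCC"
nested_pairs = [(0, 8), (1, 7), (2, 6)]

# Hypothetical example: pairs (0, 5) and (2, 7) cross, so this is not secondary.
crossing_seq = "GGAACCUU"
crossing_pairs = [(0, 5), (1, 4), (2, 7), (3, 6)]

if __name__ == "__main__":
    print(is_secondary_structure(nested_seq, nested_pairs))
    print(is_secondary_structure(crossing_seq, crossing_pairs))
```

In this sketch, a pairing that passes `pairs_are_valid` but fails `is_non_crossing` corresponds to the more general tertiary case, for which the abstract reports the hardness results; the case where at least one input passes `is_secondary_structure` is the one the abstract says can be solved optimally.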