The longest common subsequence problem for sequences with nested arc annotations

  • Authors:
  • Guohui Lin;Zhi-Zhong Chen;Tao Jiang;Jianjun Wen

  • Affiliations:
  • Department of Computing Sciences, University of Alberta, Edmonton, Alberta, Canada T6G 2E8;Department of Mathematical Sciences, Tokyo Denki University, Hatoyama, Saitama 350-0394, Japan;Department of Computer Science, University of California, Riverside, CA;Department of Computer Science, University of California, Riverside, CA

  • Venue:
  • Journal of Computer and System Sciences - Computational biology 2002
  • Year:
  • 2002

Abstract

Arc-annotated sequences are useful in representing the structural information of RNA and protein sequences. The LONGEST ARC-PRESERVING COMMON SUBSEQUENCE (LAPCS) problem has been introduced in Evans (Algorithms and complexity for annotated sequence analysis, Ph.D. Thesis, University of Victoria, 1999) as a framework for studying the similarity of arc-annotated sequences. Several algorithmic and complexity results on the LAPCS problem have been presented in Evans (1999) and Jiang et al. (in: Proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching (CPM 2000), Lecture Notes in Computer Science, Vol. 1848, 2000, pp. 154-165). In this paper, we continue this line of research and present new algorithmic and complexity results on the LAPCS problem restricted to two nested arc-annotated sequences, denoted as LAPCS(NESTED, NESTED). The restricted problem is perhaps the most interesting variant of the LAPCS problem and has important applications in the comparison of RNA secondary structures. In particular, we prove that LAPCS(NESTED, NESTED) is NP-hard, which answers an open question in Evans (1999). We then present a polynomial-time approximation scheme for LAPCS(NESTED, NESTED) with an additional c-diagonal restriction. An interesting special case, UNARY LAPCS(NESTED, NESTED), is also investigated, for which we show the NP-hardness and present a better approximation algorithm than the one for general LAPCS(NESTED, NESTED).
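
The abstract does not spell out what makes an arc annotation NESTED. As an illustration only, the following minimal sketch assumes the definition commonly used in the arc-annotation literature: an arc is a pair of sequence positions, no two arcs share an endpoint, and no two arcs cross. The function name `is_nested` and the representation of arcs as pairs of 0-based indices are hypothetical choices made for this example and are not taken from the paper.

```python
from typing import Iterable, List, Tuple

Arc = Tuple[int, int]  # assumed representation: a pair of 0-based positions


def is_nested(arcs: Iterable[Arc]) -> bool:
    """Return True if no two arcs share an endpoint and no two arcs cross.

    This encodes an assumed definition of the NESTED restriction; it is a
    sketch for illustration, not the paper's formal definition.
    """
    # Orient every arc as (left, right) and sort by left endpoint.
    normalized = sorted((min(i, j), max(i, j)) for i, j in arcs)

    # Reject shared endpoints (this also rejects a degenerate arc with i == j).
    endpoints = [p for arc in normalized for p in arc]
    if len(endpoints) != len(set(endpoints)):
        return False

    # Stack of right endpoints of arcs that are still open; the top of the
    # stack always belongs to the innermost open arc.
    open_rights: List[int] = []
    for left, right in normalized:
        # Close every arc that ends before the current arc starts.
        while open_rights and open_rights[-1] < left:
            open_rights.pop()
        # The innermost open arc ends inside the current arc: they cross.
        if open_rights and open_rights[-1] < right:
            return False
        open_rights.append(right)
    return True
```

Under this assumed definition, an RNA-like annotation such as `[(0, 5), (1, 2), (3, 4)]` is accepted, while `[(0, 2), (1, 3)]` is rejected because its two arcs cross.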