Approximate matching of secondary structures

  • Authors: Nadia El-Mabrouk; Mathieu Raffinot
  • Affiliations: Université de Montréal, Montréal, Québec, Canada; CNRS, Equipe Génome et Informatique, Evry, France
  • Venue: Proceedings of the sixth annual international conference on Computational biology
  • Year: 2002

Abstract

Several methods have been developed for identifying more or less complex RNA structures in a genome. Whatever the method, it always relies on searching for conserved primary and secondary structures. While various efficient methods have been developed for searching motifs of the primary structure, usually represented as regular expressions, little effort has been devoted to the efficient search of secondary structure signals. By a helix, we mean a structure defined by a combination of sequence and folding constraints. We present a flexible algorithm that searches for all approximate matches of a helix in a genome. Helices are represented by special regular expressions, which we call secondary expressions. The method is based on an alignment graph constructed from several copies of a pushdown automaton, arranged one on top of another. The worst-case time complexity is O(rpn), where n is the size of the genome, p the size of the secondary expression, and r its number of union symbols. We present the results of searching for specific signals of the tRNA and RNase P RNA in two genomes.
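
To make the notion of an approximate helix match concrete, the following is a minimal brute-force sketch in Python. It is not the pushdown-automaton alignment graph described in the abstract, and it does not achieve the O(rpn) bound. It assumes a deliberately simplified helix model: two arms of fixed length separated by a loop, a folding constraint only (no sequence constraint), Watson-Crick and G-U wobble pairs, and errors limited to non-pairing positions (no insertions or deletions). The function name find_helices and all of its parameters are hypothetical and chosen for illustration.

```python
# Illustrative brute-force scan for stem-loop candidates.
# NOT the automaton-based algorithm of the paper; a simplified model
# that tolerates mismatched pairs only (no insertions or deletions).

# Allowed base pairs: Watson-Crick plus G-U wobble (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_helices(genome, stem_len, min_loop, max_loop, max_mismatches):
    """Return (start, loop_len, mismatches) for every stem-loop candidate.

    A candidate consists of a left arm genome[start:start + stem_len],
    a loop of loop_len bases, and a right arm of stem_len bases.
    Position i of the left arm must pair with position stem_len - 1 - i
    of the right arm; up to max_mismatches non-pairing positions are
    tolerated.
    """
    hits = []
    n = len(genome)
    for start in range(n):
        for loop_len in range(min_loop, max_loop + 1):
            end = start + 2 * stem_len + loop_len  # one past the right arm
            if end > n:
                break  # longer loops cannot fit either
            mismatches = 0
            for i in range(stem_len):
                if (genome[start + i], genome[end - 1 - i]) not in PAIRS:
                    mismatches += 1
                    if mismatches > max_mismatches:
                        break  # too many errors, abandon this candidate
            if mismatches <= max_mismatches:
                hits.append((start, loop_len, mismatches))
    return hits


# Left arm GGCC, loop AAAA, right arm GGAC: one G-A position fails to pair.
print(find_helices("GGCCAAAAGGAC", 4, 4, 4, 1))  # [(0, 4, 1)]
```

This scan costs time proportional to the genome length times the number of loop lengths times the arm length, and it handles a single fixed helix shape. The secondary expressions of the abstract are more general, since they combine sequence and folding constraints in one pattern.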