Some string matching problems from bioinformatics which still need better solutions

  • Authors:
  • Gaston H. Gonnet

  • Affiliations:
  • ETH Informatik, 8092 Zürich, Switzerland

  • Venue:
  • Journal of Discrete Algorithms - SPIRE 2002
  • Year:
  • 2004

Abstract

Bioinformatics, the discipline that studies the computational problems arising from molecular biology, poses many interesting problems to the string searching community. We describe two problems arising from bioinformatics, their preliminary solutions, and the more general problem that they pose. The first problem is searching for α-helices in protein sequences. This particular instance of the search is based on matching hydrophobicity/hydrophilicity. We give an algorithm that is linear in the sequence length for a fixed helix length and O(n log n) for any helix length. The second problem is matching probabilistic sequences against sequences or against other probabilistic sequences. In both cases we derive efficient formulas to compute scores according to a Markovian model of evolution.
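
To make the first problem concrete, the sketch below scores every window of a fixed helix length by how strongly its hydrophobicity alternates with the period of an α-helix. It is not the algorithm of the paper, which the abstract does not spell out. The choices made here are assumptions for illustration: the Kyte-Doolittle hydropathy scale, a turn of 100° per residue, the hydrophobic moment as the score, and the names `KD` and `helix_scores`. The direct loop costs O(n·k) for sequence length n and window length k, which is linear in n when k is fixed; it does not attempt the O(n log n) bound for arbitrary helix lengths.

```python
import math

# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
# Assumption: the abstract does not say which scale the paper uses.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def helix_scores(seq, k=18, angle_deg=100.0):
    """Return the hydrophobic moment of every length-k window of seq.

    Each residue j in a window is treated as a vector of length KD[residue]
    pointing at angle j * angle_deg around the helix axis. The score is the
    length of the vector sum divided by k. A window whose hydrophobic
    residues cluster on one face of the helix gets a high score.
    """
    delta = math.radians(angle_deg)
    # Precompute the k rotation factors once; they are the same for each window.
    cos_w = [math.cos(j * delta) for j in range(k)]
    sin_w = [math.sin(j * delta) for j in range(k)]
    scores = []
    for i in range(len(seq) - k + 1):
        re = sum(KD[seq[i + j]] * cos_w[j] for j in range(k))
        im = sum(KD[seq[i + j]] * sin_w[j] for j in range(k))
        scores.append(math.hypot(re, im) / k)
    return scores
```

A window that places leucine on one face of the helix and lysine on the other scores high, while a window of a single repeated residue scores close to zero, because 18 residues at 100° each complete exactly five turns and the vectors cancel.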