A Constraint Based Structure Description Language for Biosequences

  • Authors:
  • Ingvar Eidhammer;Inge Jonassen;Svenn Helge Grindhaug;David Gilbert;Madu Ratnayake

  • Affiliations:
  • Department of Informatics, University of Bergen, HIB, N-5020 Bergen Norway ingvar@ii.uib.no;Department of Informatics, University of Bergen, HIB, N-5020 Bergen Norway inge@ii.uib.no;Department of Informatics, University of Bergen, HIB, N-5020 Bergen Norway svenn@ii.uib.no;Department of Computing, Northampton Square, London EC1V 0HB, United Kingdom drg@soi.city.ac.uk;Department of Computing, Northampton Square, London EC1V 0HB, United Kingdom

  • Venue:
  • Constraints
  • Year:
  • 2001

Abstract

We report an investigation into how constraint-solving techniques can be used to search for patterns in sequences (or strings) of symbols over a finite alphabet. We define a constraint-based structure description language for biosequences and define an algorithm that solves the structure-searching problem as a constraint satisfaction problem (CSP). The methodology we have developed can describe two-dimensional structures of biosequences, such as tandem repeats, stem loops, palindromes and pseudo-knots. We also report on an implementation of the language in the constraint logic programming language clp(FD), with test results for a simple searching algorithm, and on results from a preliminary implementation in C++ that uses consistency-checking techniques from CSP solving.
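
To make the idea of describing a structure by constraints concrete, the following is a minimal sketch of how two of the structures named above, tandem repeats and stem loops, could be expressed as constraints over positions in a sequence. It is not the paper's description language, nor its clp(FD) or C++ implementation: the function names, the RNA alphabet with Watson-Crick pairing only, and the brute-force enumeration of positions are all assumptions made for illustration.

```python
from typing import Iterator, Tuple

# Assumption: RNA alphabet with Watson-Crick pairs only (no G-U wobble pairs).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def tandem_repeats(seq: str, unit_len: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, unit) wherever a unit of length unit_len is
    immediately followed by an identical copy of itself."""
    for start in range(len(seq) - 2 * unit_len + 1):
        unit = seq[start:start + unit_len]
        # Constraint: the second segment equals the first.
        if seq[start + unit_len:start + 2 * unit_len] == unit:
            yield start, unit


def stem_loops(seq: str, stem_len: int, min_loop: int,
               max_loop: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, loop_len) for each stem-loop candidate: a left stem,
    a loop of bounded length, and a right stem that is the reverse
    complement of the left stem."""
    n = len(seq)
    for start in range(n):
        for loop_len in range(min_loop, max_loop + 1):
            right = start + stem_len + loop_len
            if right + stem_len > n:
                break  # longer loops would also run past the end
            left_stem = seq[start:start + stem_len]
            right_stem = seq[right:right + stem_len]
            # Constraint: position i of the left stem pairs with
            # position stem_len - 1 - i of the right stem.
            if all(COMPLEMENT.get(a) == b
                   for a, b in zip(left_stem, reversed(right_stem))):
                yield start, loop_len


if __name__ == "__main__":
    print(list(tandem_repeats("ACGACGUU", 3)))
    print(list(stem_loops("GGCAAAAGCC", 3, 3, 5)))
```

Each structure here is a set of variables (start positions, segment lengths) plus relations between segments (equality, reverse complementarity, bounded distance). A constraint solver would prune the position domains with consistency checks rather than enumerate every candidate as this sketch does.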