MoDEL: an efficient strategy for ungapped local multiple alignment

  • Authors:
  • David Hernandez; Robin Gras; Ron Appel

  • Affiliations:
  • Proteome Informatics Group, Swiss Institute of Bioinformatics, CMU, 1 rue Michel-Servet, CH 1211 Geneva 4, Switzerland (all authors)

  • Venue:
  • Computational Biology and Chemistry
  • Year:
  • 2004

Abstract

We introduce a method for ungapped local multiple alignment (ULMA) in a given set of amino acid or nucleotide sequences. The method explores two search spaces using a linked optimization strategy. The first search space, M, consists of all possible words of a given length W defined on the residue alphabet. An evolutionary algorithm searches this space globally. The second search space, P, consists of all possible ULMAs in the sequence set, each ULMA being represented by a position vector that defines exactly one subsequence of length W per sequence. This search space is sampled with hill-climbing processes. The searches of the two spaces are coupled by projecting high-scoring results from the global evolutionary search of M onto P. The hill-climbing processes then refine the optimization by local search, using the relative entropy between the ULMA and background residue frequencies as the objective function. We demonstrate some advantages of our strategy by analyzing difficult natural amino acid sequences and artificial datasets. A web interface is available at http://idefix.univ-rennes1.fr:8080/PatternDiscovery/
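
The coupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: the relative entropy is summed over the W alignment columns with base-2 logarithms, a word from M is projected onto P by choosing, in each sequence, the window with the most matching residues, and the hill climber changes one position at a time. The function names `relative_entropy`, `project_word` and `hill_climb` are hypothetical, every sequence is assumed to be at least W residues long, and the evolutionary search of M is omitted.

```python
import math
from collections import Counter


def relative_entropy(sequences, positions, width, background):
    """Relative entropy of a ULMA against background frequencies.

    Assumed form: sum over columns j and residues a of
    f_j(a) * log2(f_j(a) / q(a)), where f_j is the column frequency
    and q is the background frequency.
    """
    n = len(sequences)
    total = 0.0
    for col in range(width):
        counts = Counter(seq[pos + col] for seq, pos in zip(sequences, positions))
        for residue, count in counts.items():
            freq = count / n
            total += freq * math.log2(freq / background[residue])
    return total


def project_word(word, sequences):
    """Project a word from M onto a position vector in P.

    Assumed rule: in each sequence, take the first window with the
    highest number of residues identical to the word.
    """
    width = len(word)
    positions = []
    for seq in sequences:
        best_pos, best_matches = 0, -1
        for p in range(len(seq) - width + 1):
            matches = sum(a == b for a, b in zip(word, seq[p:p + width]))
            if matches > best_matches:
                best_pos, best_matches = p, matches
        positions.append(best_pos)
    return positions


def hill_climb(sequences, positions, width, background):
    """Refine a position vector by single-position moves.

    A move is accepted only if it strictly improves the score; the
    loop stops when no such move exists (a local optimum).
    """
    positions = list(positions)
    best = relative_entropy(sequences, positions, width, background)
    improved = True
    while improved:
        improved = False
        for i, seq in enumerate(sequences):
            for p in range(len(seq) - width + 1):
                candidate = positions[:i] + [p] + positions[i + 1:]
                score = relative_entropy(sequences, candidate, width, background)
                if score > best + 1e-12:
                    best, positions, improved = score, candidate, True
    return positions, best
```

In this sketch, a candidate word such as `"ACGT"` would first be passed to `project_word` to obtain a starting position vector, which `hill_climb` then refines. With a uniform nucleotide background, a perfectly conserved column contributes 2 bits, so a fully conserved alignment of width 4 scores 8 bits under these assumptions.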