Toward a phylogenetically aware algorithm for fast DNA similarity search

  • Authors:
  • Jeremy Buhler; Rachel Nordgren

  • Affiliations:
  • Department of Computer Science and Engineering, Washington University, St. Louis, MO (both authors)

  • Venue:
  • RCG'04 Proceedings of the 2004 RECOMB international conference on Comparative Genomics
  • Year:
  • 2004

Abstract

High-throughput DNA sequencing is now producing collections of genomes from moderately or closely related organisms. Such a collection may be represented as a multiple alignment M of orthologous sequences, which induces a phylogenetic tree τ. Long-range genomic alignments with phylogenies have not yet found a prominent place in BLAST-like similarity search algorithms, though using them directly as databases can potentially yield more accurate and more informative alignments. This work describes how to construct local alignments between a query and a multiple alignment in a way that explicitly uses a phylogenetic tree τ. We give an EM algorithm to find a locally optimal alignment when the location of the query on the tree τ is not known. An initial implementation of the method is tested on a large multiple alignment of sequences from eight vertebrate genomes.
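To make the idea of scoring a query against an alignment column with a tree concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Jukes-Cantor substitution model, a uniform base prior, and a query attached at the root of τ; the abstract instead treats the query's location as unknown and handles it with EM, which is not shown here. The tree encoding, branch lengths, species names, and function names (`jc_prob`, `conditional`, `root_posterior`, `query_score`) are all hypothetical choices for illustration. The likelihood computation is standard Felsenstein pruning.

```python
import math

BASES = "ACGT"

def jc_prob(t):
    """Jukes-Cantor probabilities (same base, one specific different base)
    for a branch of length t in expected substitutions per site."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e

def conditional(node, column):
    """Felsenstein pruning: likelihood of the leaves below `node`
    given each possible base at `node`.
    A node is a leaf name (str) or a tuple (left, t_left, right, t_right).
    Assumption: gaps and unknown characters are treated as missing data."""
    if isinstance(node, str):
        base = column[node]
        if base not in BASES:
            return [1.0] * 4
        return [1.0 if b == base else 0.0 for b in BASES]
    left, t_left, right, t_right = node
    result = [1.0] * 4
    for child, t in ((left, t_left), (right, t_right)):
        child_like = conditional(child, column)
        same, diff = jc_prob(t)
        total = sum(child_like)
        for i in range(4):
            # sum over child bases j of P(j | i, t) * L_child(j)
            result[i] *= same * child_like[i] + diff * (total - child_like[i])
    return result

def root_posterior(tree, column):
    """Posterior over the root base given the column, with a uniform prior."""
    like = conditional(tree, column)
    z = sum(0.25 * l for l in like)
    return [0.25 * l / z for l in like]

def query_score(tree, column, query_base):
    """Log-odds (bits) of the query base under the root posterior
    versus a uniform background. Assumes the query attaches at the root."""
    post = root_posterior(tree, column)
    return math.log2(post[BASES.index(query_base)] / 0.25)

# Hypothetical three-species tree with made-up branch lengths.
TREE = (("human", 0.1, "mouse", 0.1), 0.05, "chicken", 0.3)
```

For a column such as `{"human": "A", "mouse": "A", "chicken": "A"}`, `query_score(TREE, column, "A")` is positive and `query_score(TREE, column, "C")` is negative, so a conserved column rewards a matching query base more than a single-sequence comparison against a divergent species would. Summing such per-column scores along a gapless diagonal gives one simple way to score a local alignment between a query and M under these assumptions.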