Rapid homology search with two-stage extension and daughter seeds

  • Authors: Miklós Csűrös; Bin Ma

  • Affiliations: Department of Computer Science and Operations Research, Université de Montréal, Montréal, Qué., Canada; Department of Computer Science, University of Western Ontario, London, Ont., Canada

  • Venue: COCOON'05 Proceedings of the 11th annual international conference on Computing and Combinatorics
  • Year: 2005

Abstract

Using a seed to rapidly “hit” possible homologies for further examination is a common practice for speeding up homology search in molecular sequences. It has been shown that a collection of higher-weight seeds has better sensitivity than a single lower-weight seed at the same speed. However, the huge memory requirements of high-weight seeds diminish their advantages. This paper describes a two-stage extension method that simulates high-weight seeds with modest memory requirements. The paper also proposes the use of so-called daughter seeds, which extend the previously studied vector seed idea. Daughter seeds, especially when combined with the two-stage extension, provide the flexibility to maximize the independence between the seeds, a well-known criterion for maximizing sensitivity. Some other practical techniques for reducing memory usage are also discussed in the paper.
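
For readers unfamiliar with seed-based search, the basic notion of a seed "hit" can be illustrated with a minimal sketch. It is not the two-stage extension or the daughter-seed construction from the paper, whose details the abstract does not give. It assumes the common convention of writing a spaced seed as a string of `1` (position must match) and `0` (position is ignored), with the weight being the number of `1`s. The function names `seed_key`, `build_index`, and `find_hits` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def seed_key(seq, pos, seed):
    """Letters of seq at the '1' positions of seed, with the seed placed at pos."""
    return "".join(seq[pos + i] for i, c in enumerate(seed) if c == "1")


def build_index(target, seed):
    """Map each seed key to the list of target positions where it occurs.

    Illustrative assumption: a plain dictionary index. For DNA, a seed of
    weight w can produce up to 4**w distinct keys, which is one way to see
    why memory grows quickly with seed weight.
    """
    index = defaultdict(list)
    for pos in range(len(target) - len(seed) + 1):
        index[seed_key(target, pos, seed)].append(pos)
    return index


def find_hits(query, target, seed):
    """Return (query_pos, target_pos) pairs that agree at every '1' position."""
    index = build_index(target, seed)
    hits = []
    for q in range(len(query) - len(seed) + 1):
        for t in index.get(seed_key(query, q, seed), []):
            hits.append((q, t))
    return hits


# The two sequences differ only at index 2 (G vs. T).
query, target = "ACGTA", "ACTTA"
spaced_hits = find_hits(query, target, "1101")      # mismatch falls on the '0'
contiguous_hits = find_hits(query, target, "111")   # same weight, no gaps
```

In this toy input the spaced seed `1101` reports a hit at `(0, 0)` because the mismatch lands on its ignored position, while the contiguous seed `111` of the same weight reports none. Each hit would then be passed on for further examination, for example by an extension step.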