Fast and accurate database homology search using upper bounds of local alignment scores

  • Authors:
  • Masumi Itoh;Susumu Goto;Tatsuya Akutsu;Minoru Kanehisa

  • Affiliations:
  • Bioinformatics Center, Institute for Chemical Research, Kyoto University Gokasho, Uji, Kyoto 611-0011, Japan;Bioinformatics Center, Institute for Chemical Research, Kyoto University Gokasho, Uji, Kyoto 611-0011, Japan;Bioinformatics Center, Institute for Chemical Research, Kyoto University Gokasho, Uji, Kyoto 611-0011, Japan;Bioinformatics Center, Institute for Chemical Research, Kyoto University Gokasho, Uji, Kyoto 611-0011, Japan

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Quantified Score

Hi-index 3.85

Visualization

Abstract

Motivation: It is widely recognized that homology search and ortholog clustering are very useful for analyzing biological sequences. However, recent growth of sequence database size makes homolog detection difficult, and rapid and accurate methods are required. Results: We present a novel method for fast and accurate homology detection, assuming that the Smith--Waterman (SW) scores between all similar sequence pairs in a target database are computed and stored. In this method, SW alignment is computed only if the upper bound, which is derived from our novel inequality, is higher than the given threshold. In contrast to other methods such as FASTA and BLAST, this method is guaranteed to find all sequences whose scores against the query are higher than the specified threshold. Results of computational experiments suggest that the method is dozens of times faster than SSEARCH if genome sequence data of closely related species are available. Availability: The programs for fast homolog detection can be downloaded from ftp://ftp.kuicr.kyoto-u.ac.jp/itoh/ Contact: itoh@kuicr.kyoto-u.ac.jp