Fast and accurate database homology search using upper bounds of local alignment scores

Authors:
Masumi Itoh;Susumu Goto;Tatsuya Akutsu;Minoru Kanehisa
Affiliations:
Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan;Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan;Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan;Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
Venue:
Bioinformatics
Year:
2005

Abstract

Motivation: It is widely recognized that homology search and ortholog clustering are very useful for analyzing biological sequences. However, recent growth of sequence database size makes homolog detection difficult, and rapid and accurate methods are required.

Results: We present a novel method for fast and accurate homology detection, assuming that the Smith-Waterman (SW) scores between all similar sequence pairs in a target database are computed and stored. In this method, SW alignment is computed only if the upper bound, which is derived from our novel inequality, is higher than the given threshold. In contrast to other methods such as FASTA and BLAST, this method is guaranteed to find all sequences whose scores against the query are higher than the specified threshold. Results of computational experiments suggest that the method is dozens of times faster than SSEARCH if genome sequence data of closely related species are available.

Availability: The programs for fast homolog detection can be downloaded from ftp://ftp.kuicr.kyoto-u.ac.jp/itoh/

Contact: itoh@kuicr.kyoto-u.ac.jp
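The filtering idea in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state their inequality, so the sketch substitutes a trivial length-based bound (`length_bound`) that is valid only for the toy scoring scheme assumed here (match +2, mismatch -1, linear gap -2). The names `sw_score`, `length_bound`, and `filtered_search` are hypothetical. What the sketch shows is only the control flow: the full SW computation is skipped whenever a valid upper bound does not exceed the threshold, so no sequence scoring above the threshold can be missed.

```python
from typing import Callable, Dict, Tuple

# Assumed toy scoring scheme (not taken from the paper).
MATCH, MISMATCH, GAP = 2, -1, -2


def sw_score(a: str, b: str) -> int:
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + GAP, curr[j - 1] + GAP)
            best = max(best, curr[j])
        prev = curr
    return best


def length_bound(query: str, target: str) -> int:
    """Stand-in upper bound (NOT the paper's inequality).

    Only matched columns score positively, and a local alignment has at
    most min(len(query), len(target)) of them.
    """
    return MATCH * min(len(query), len(target))


def filtered_search(
    query: str,
    database: Dict[str, str],
    threshold: int,
    upper_bound: Callable[[str, str], int] = length_bound,
) -> Tuple[Dict[str, int], int]:
    """Return (hits scoring above threshold, number of SW computations)."""
    hits: Dict[str, int] = {}
    computed = 0
    for name, seq in database.items():
        # Skip the expensive alignment when the bound rules the target out.
        if upper_bound(query, seq) <= threshold:
            continue
        computed += 1
        score = sw_score(query, seq)
        if score > threshold:
            hits[name] = score
    return hits, computed
```

Calling `filtered_search("ACGTACGT", {"a": "ACGTACGT", "b": "ACG"}, 8)` aligns only against `a`, because the bound for `b` is already at or below the threshold. Any tighter bound, such as one derived from stored database-versus-database SW scores as the abstract describes, could be passed in through the `upper_bound` parameter without changing the search loop.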