Amino Acid Classification and Hash Seeds for Homology Search

  • Authors:
  • Weiming Li; Bin Ma; Kaizhong Zhang

  • Affiliations:
  • Department of Computer Science, University of Western Ontario, London, Canada N6A 5B7; School of Computer Science, University of Waterloo, Waterloo, Canada N2L 3G1; Department of Computer Science, University of Western Ontario, London, Canada N6A 5B7

  • Venue:
  • BICoB '09 Proceedings of the 1st International Conference on Bioinformatics and Computational Biology
  • Year:
  • 2009

Abstract

Spaced seeds have been extensively studied in the homology search field. A spaced seed can be regarded as a very special type of hash function on k-mers, where two k-mers have the same hash value if and only if they are identical at the w (w < k) positions designated by the seed. Spaced seeds have substantially increased homology search sensitivity. It is then natural to ask whether there is a hash function (called a hash seed) that provides better sensitivity than the spaced seed. We study this question in this paper. We propose a strategy to classify amino acids, which leads to a better hash seed. Our results raise a new question about how to design the best hash seed.
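
The view of a spaced seed as a hash function on k-mers can be illustrated with a minimal Python sketch. The function names, the seed notation ('1' for a position that must match exactly, 'c' for a position compared by amino acid class only, '0' for an ignored position), and the grouping in EXAMPLE_CLASSES are all assumptions made for illustration; in particular, the grouping is not the classification proposed in the paper, and the class-based function is only one possible way a hash seed might use such a classification.

```python
def spaced_seed_hash(kmer: str, seed: str) -> str:
    """Hash a k-mer by keeping only the residues at the seed's '1' positions.

    Two k-mers get the same hash value exactly when they agree at those positions.
    """
    if len(kmer) != len(seed):
        raise ValueError("k-mer and seed must have the same length")
    return "".join(c for c, s in zip(kmer, seed) if s == "1")


# Hypothetical grouping of the 20 amino acids, for illustration only.
# It is NOT the classification proposed in the paper.
EXAMPLE_CLASSES = ["LVIM", "FWY", "KRH", "DE", "ST", "NQ", "AG", "C", "P"]
CLASS_OF = {aa: str(i) for i, group in enumerate(EXAMPLE_CLASSES) for aa in group}


def class_seed_hash(kmer: str, seed: str) -> str:
    """Assumed hash-seed variant: '1' keeps the residue, 'c' keeps only its class.

    Class labels are digits and residues are letters, so the two never collide.
    """
    if len(kmer) != len(seed):
        raise ValueError("k-mer and seed must have the same length")
    out = []
    for c, s in zip(kmer, seed):
        if s == "1":
            out.append(c)            # exact-match position
        elif s == "c":
            out.append(CLASS_OF[c])  # class-only position
        # any other symbol (e.g. '0') is a don't-care position
    return "".join(out)


if __name__ == "__main__":
    # The two k-mers differ only at the ignored third position.
    print(spaced_seed_hash("ACDEF", "11011") == spaced_seed_hash("ACWEF", "11011"))
    # K and R fall in the same example class, so the class position still matches.
    print(class_seed_hash("LKD", "1c1") == class_seed_hash("LRD", "1c1"))
```

Under these assumptions, a class-only position is stricter than a don't-care position but looser than an exact-match position, which is the kind of intermediate behavior a plain spaced seed cannot express.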