Designing patterns for profile HMM search

  • Authors:
  • Yanni Sun; Jeremy Buhler

  • Affiliations:
  • Department of Computer Science and Engineering, Washington University St Louis, MO 63130, USA;Department of Computer Science and Engineering, Washington University St Louis, MO 63130, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2007

Abstract

Motivation: Profile HMMs are a powerful tool for modeling conserved motifs in proteins. These models are widely used by search tools to classify new protein sequences into families based on domain architecture. However, the proliferation of known motifs and new proteomic sequence data poses a computational challenge for search, requiring days of CPU time to annotate an organism's proteome.

Results: We use PROSITE-like patterns as a filter to speed up the comparison between protein sequence and profile HMM. A set of patterns is designed starting from the HMM, and only sequences matching one of these patterns are compared to the HMM by full dynamic programming. We give an algorithm to design patterns with maximal sensitivity subject to a bound on the false positive rate. Experiments show that our patterns typically retain at least 90% of the sensitivity of the source HMM while accelerating search by an order of magnitude.

Availability: Contact the first author at the address below.

Contact: yanni@cse.wustl.edu
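
The filtering step described under Results can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not cover their pattern design algorithm; it only shows how a set of PROSITE-style patterns might be used to select the sequences that go on to full dynamic programming. The function names `prosite_to_regex` and `filter_candidates`, the example pattern, and the example sequences are all hypothetical, and only a common subset of PROSITE syntax is handled (`x` wildcards, `[...]` allowed sets, `{...}` excluded sets, and `(n)` or `(n,m)` repeats).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern such as 'C-x(2,4)-C-[LIVM]' to a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Separate the residue specification from an optional repeat, e.g. x(2,4).
        m = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if m is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        core, repeat = m.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # any residue except those listed
        # A single residue or a bracketed set [ABC] is already valid regex.
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)


def filter_candidates(sequences: dict, patterns: list) -> dict:
    """Keep only sequences that match at least one pattern (hypothetical filter stage)."""
    compiled = [re.compile(prosite_to_regex(p)) for p in patterns]
    return {
        name: seq
        for name, seq in sequences.items()
        if any(rx.search(seq) for rx in compiled)
    }


# Illustrative data only: neither the pattern nor the sequences come from the paper.
patterns = ["C-x(2,4)-C-[LIVM]"]
sequences = {"seq1": "MKCAACLTT", "seq2": "MKAAAALTT"}

candidates = filter_candidates(sequences, patterns)
# Only the surviving candidates would be passed to the expensive
# profile HMM comparison, which is outside the scope of this sketch.
print(sorted(candidates))
```

In this sketch a sequence survives if any one pattern matches anywhere in it, which mirrors the "matching one of these patterns" wording of the abstract. How the patterns are chosen to trade sensitivity against false positive rate is the subject of the paper and is not reproduced here.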