A metric model of amino acid substitution

  • Authors:
  • Weijia Xu;Daniel P. Miranker

  • Affiliations:
  • Department of Computer Sciences, The Center for Computational Biology and Bioinformatics, University of Texas, Austin, TX 78712, USA;Department of Computer Sciences, The Center for Computational Biology and Bioinformatics, University of Texas, Austin, TX 78712, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2004

Quantified Score

Hi-index 3.84

Visualization

Abstract

Motivation: We address the question of whether there exists an effective evolutionary model of amino-acid substitution that forms a metric-distance function. There is always a trade-off between speed and sensitivity among competing computational methods of determining sequence homology. A metric model of evolution is a prerequisite for the development of an entire class of fast sequence analysis algorithms that are both scalable, O(log n) and sensitive. Results: We have reworked the mathematics of the point accepted mutation model (PAM) by calculating the expected time between accepted mutations in lieu of calculating log-odds probabilities. The resulting substitution matrix (mPAM) forms a metric. We validate the application of the mPAM evolutionary model for sequence homology by executing sequence queries from a controlled yeast protein homology search benchmark. We compare the accuracy of the results of mPAM and PAM similarity matrices as well as three prior metric models. The experiment shows that mPAM significantly outperforms the other three metrics and sufficiently approaches the sensitivity of PAM250 to make it applicable to the management of protein sequence databases.