Bayesian Monte Carlo estimation for profile hidden Markov models

  • Authors:
  • Steven J. Lewis; Alpan Raval; John E. Angus

  • Affiliations:
  • School of Mathematical Sciences, Claremont Graduate University, 710 N. College Avenue, Claremont, CA 91711, USA
  • School of Mathematical Sciences, Claremont Graduate University, 710 N. College Avenue, Claremont, CA 91711, USA and Keck Graduate Institute of Applied Life Sciences, 535 Watson Drive, Claremont, C ...
  • School of Mathematical Sciences, Claremont Graduate University, 710 N. College Avenue, Claremont, CA 91711, USA

  • Venue:
  • Mathematical and Computer Modelling: An International Journal
  • Year:
  • 2008

Abstract

Hidden Markov models are used as tools for pattern recognition in a number of areas, ranging from speech processing to biological sequence analysis. Profile hidden Markov models belong to the class of so-called "left-right" models, with an architecture specifically suited to classifying proteins into structural families based on their amino acid sequences. Standard learning methods for such models apply a variety of heuristics to the expectation-maximization implementation of maximum likelihood estimation in order to find the global maximum of the likelihood function. Here, we compare maximum likelihood estimation with fully Bayesian estimation of parameters for profile hidden Markov models with a small number of parameters. We find that, relative to maximum likelihood methods, Bayesian methods assign higher scores to data sequences that are distantly related to the pattern consensus, classify these sequences correctly more often, and remain robust to misspecification of the number of model parameters. Although our study is limited in scope, we expect our results to remain relevant for models with a large number of parameters and for other types of left-right hidden Markov models.
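
The abstract contrasts point-estimate (maximum likelihood) scoring of sequences under a profile hidden Markov model with fully Bayesian scoring. The sketch below is not the authors' implementation; it only illustrates that contrast on a toy left-right HMM with a discrete alphabet. The model size, transition structure, emission values, Dirichlet prior, and the simple prior-sampling Monte Carlo estimator of the marginal likelihood are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy left-right HMM: 3 states visited in order, emissions over a
# 4-symbol alphabet. All sizes and values are illustrative, not the
# model studied in the paper.
N_STATES, N_SYMBOLS = 3, 4

# Left-right transition matrix: each state either stays put or moves right.
A = np.array([
    [0.1, 0.9, 0.0],
    [0.0, 0.1, 0.9],
    [0.0, 0.0, 1.0],
])
pi = np.array([1.0, 0.0, 0.0])  # always start in the first state


def forward_loglik(seq, A, pi, B):
    """Log-likelihood of an observation sequence via the scaled forward algorithm."""
    alpha = pi * B[:, seq[0]]
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for obs in seq[1:]:
        alpha = (alpha @ A) * B[:, obs]
        log_lik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return log_lik


# Point-estimate scoring: a single fitted emission matrix (here just an
# assumed matrix standing in for a maximum-likelihood fit) gives one
# log-likelihood score per sequence.
B_point = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
])
seq = np.array([0, 1, 1, 2, 2])
print("point-estimate score:", forward_loglik(seq, A, pi, B_point))

# Bayesian Monte Carlo scoring: average the likelihood over emission
# matrices drawn from a Dirichlet prior, approximating the marginal
# likelihood  p(seq) = E_theta[ p(seq | theta) ].
n_samples = 5000
prior_alpha = np.ones(N_SYMBOLS)  # uniform Dirichlet prior (an assumption)
logs = np.empty(n_samples)
for i in range(n_samples):
    B = rng.dirichlet(prior_alpha, size=N_STATES)
    logs[i] = forward_loglik(seq, A, pi, B)

# log-mean-exp for numerical stability
bayes_score = np.log(np.mean(np.exp(logs - logs.max()))) + logs.max()
print("Monte Carlo marginal-likelihood score:", bayes_score)
```

The Bayesian score integrates over parameter uncertainty rather than committing to a single parameter estimate, which is one intuition for why it can behave differently on sequences far from the consensus; the crude prior-sampling estimator above is used only for clarity, and more efficient posterior-sampling schemes would typically be preferred.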