A method to design standard HMMs with desired length distribution for biological sequence analysis

Authors:
Hongmei Zhu;Jiaxin Wang;Zehong Yang;Yixu Song
Affiliations:
Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, China (all authors)
Venue:
WABI'06 Proceedings of the 6th international conference on Algorithms in Bioinformatics
Year:
2006

Abstract

Motivation: Hidden Markov Models (HMMs) have been widely used for biological sequence analysis. When modeling a phenomenon in which, for instance, the nucleotide distribution does not change across DNA segments of various lengths, there are two popular approaches to achieving a desired length distribution: explicit modeling and implicit modeling. Implicit modeling requires an elaborately designed model structure. So far, no general procedure has been available for designing such a model structure automatically from the training data.

Results: We present an iterative algorithm that designs the structure of a standard HMM with a desired length distribution from the training data. The basic idea behind the algorithm is to use multiple shifted negative binomial distributions to model the empirical length distribution. A negative binomial distribution is obtained from an array of n states, each with the same self-transition probability. We shift this negative binomial distribution by placing a series of linearly connected states before the negative binomial model.
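
The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' iterative design algorithm, only a check of the length distribution that one shifted chain produces. The sketch assumes that every state emits exactly one symbol per visit, that the chain has n >= 1 self-looping states with shared self-transition probability p, and that `shift` extra states without self-loops are placed in front. The function names `shifted_negbin_pmf` and `chain_length_pmf` are hypothetical and introduced here for illustration only.

```python
from math import comb


def shifted_negbin_pmf(length, n, p, shift=0):
    """Closed-form P(L = length) for `shift` single-visit states followed by
    n >= 1 self-looping states (assumed: one emitted symbol per state visit)."""
    k = length - shift - n  # number of self-loop transitions taken in total
    if k < 0:
        return 0.0
    return comb(length - shift - 1, n - 1) * p**k * (1 - p) ** n


def chain_length_pmf(max_len, n, p, shift=0):
    """Same distribution, obtained by pushing probability mass through the
    linear chain of states one emitted symbol at a time."""
    total = shift + n
    # Self-transition probability of each state: 0 for shift states, p otherwise.
    stay = [p if i >= shift else 0.0 for i in range(total)]
    # occupancy[i] = probability of sitting in state i while emitting symbol t.
    occupancy = [0.0] * total
    occupancy[0] = 1.0
    pmf = [0.0] * (max_len + 1)
    for t in range(1, max_len + 1):
        # Leaving the last state right after symbol t ends a segment of length t.
        pmf[t] = occupancy[-1] * (1 - stay[-1])
        nxt = [0.0] * total
        for i in range(total):
            nxt[i] += occupancy[i] * stay[i]
            if i + 1 < total:
                nxt[i + 1] += occupancy[i] * (1 - stay[i])
        occupancy = nxt
    return pmf


if __name__ == "__main__":
    pmf = chain_length_pmf(60, n=3, p=0.6, shift=4)
    for length in range(5, 12):
        print(length, pmf[length], shifted_negbin_pmf(length, 3, 0.6, 4))
```

Under these assumptions the shortest possible segment has length shift + n, and the two functions should agree up to floating-point error. A mixture of several such chains, each with its own n, p, shift, and mixture weight, would correspond to the "multiple shifted negative binomial distributions" mentioned in the abstract.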