Fast communication: Bernoulli versus Markov: Investigation of state transition regime in switching-state acoustic models

Authors:
Jahanshah Kabudian;Mohammad Mehdi Homayounpour;Seyed Mohammad Ahadi
Affiliations:
Department of Computer Engineering, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, Tehran 15914, Iran;Department of Computer Engineering, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, Tehran 15914, Iran;Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, Tehran 15914, Iran
Venue:
Signal Processing
Year:
2009

Abstract

This paper introduces a new acoustic model, the time-inhomogeneous hidden Bernoulli model (TI-HBM), as an alternative to the hidden Markov model (HMM) for continuous speech recognition. Unlike in the HMM, the state transition process in the TI-HBM is not a Markov process but an independent (generalized Bernoulli) process. This difference eliminates state-level dynamic programming from the TI-HBM decoding process. As a result, the computational complexity of probability evaluation and state estimation in the TI-HBM is O(NL), compared with O(N^2L) for the HMM, where N is the number of states and L is the sequence length. As a new framework for phone duration modeling, the TI-HBM can model acoustic-unit duration (e.g. phone duration) through a built-in parameter called the survival probability. As with the HMM, the three essential problems have been solved for the TI-HBM. An EM-based method is proposed for training the TI-HBM parameters. Phone recognition experiments on spoken Persian (Farsi) show that the TI-HBM has some advantages over the HMM (e.g. greater simplicity and faster recognition) and also outperforms it in phone recognition accuracy.
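The complexity gap stated in the abstract can be illustrated with a minimal sketch. It is not the authors' TI-HBM formulation, which the abstract does not spell out. It only assumes that, when states are drawn independently at each frame, the best state can be picked frame by frame. The per-frame prior table `log_prior[t][j]` and all function names are hypothetical, and the survival-probability duration model is not represented.

```python
import math

def viterbi_hmm(log_init, log_trans, log_emit):
    """Standard HMM Viterbi decoding, O(N^2 L).

    log_emit[t][j] is log p(x_t | state j); log_trans[i][j] is log P(j | i).
    """
    n_states = len(log_init)
    delta = [log_init[j] + log_emit[0][j] for j in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emit)):
        new_delta, ptr = [], []
        for j in range(n_states):
            # Inner maximization over predecessors: the extra factor of N.
            best_i = max(range(n_states), key=lambda i: delta[i] + log_trans[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log_trans[best_i][j] + log_emit[t][j])
        delta = new_delta
        backpointers.append(ptr)
    state = max(range(n_states), key=lambda j: delta[j])
    best_score = delta[state]
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_score

def decode_independent(log_prior, log_emit):
    """Frame-wise decoding under independent states, O(N L).

    log_prior[t][j] is an assumed (hypothetical) log P(state j at frame t).
    No predecessor search is needed, so there is no dynamic programming.
    """
    path, total = [], 0.0
    for prior_t, emit_t in zip(log_prior, log_emit):
        scores = [p + e for p, e in zip(prior_t, emit_t)]
        best = max(range(len(scores)), key=scores.__getitem__)
        path.append(best)
        total += scores[best]
    return path, total

def log_likelihood_independent(log_prior, log_emit):
    """Probability evaluation under independent states, O(N L)."""
    total = 0.0
    for prior_t, emit_t in zip(log_prior, log_emit):
        scores = [p + e for p, e in zip(prior_t, emit_t)]
        m = max(scores)
        # Log-sum-exp over states for this frame only.
        total += m + math.log(sum(math.exp(s - m) for s in scores))
    return total
```

When every row of the HMM transition matrix is identical, the next state does not depend on the previous one, so both decoders return the same path. That special case is a convenient sanity check for the frame-wise version.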