Fast communication: Bernoulli versus Markov: Investigation of state transition regime in switching-state acoustic models

  • Authors:
  • Jahanshah Kabudian;Mohammad Mehdi Homayounpour;Seyed Mohammad Ahadi

  • Affiliations:
  • Department of Computer Engineering, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, Tehran 15914, Iran;Department of Computer Engineering, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, Tehran 15914, Iran;Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Hafez Avenue, Tehran 15914, Iran

  • Venue:
  • Signal Processing
  • Year:
  • 2009

Abstract

This paper introduces a new acoustic model, the time-inhomogeneous hidden Bernoulli model (TI-HBM), as an alternative to the hidden Markov model (HMM) in continuous speech recognition. Unlike in the HMM, the state transition process in TI-HBM is not a Markov process but an independent (generalized Bernoulli) process. This difference eliminates state-level dynamic programming from the TI-HBM decoding process. As a result, the computational complexity of TI-HBM for probability evaluation and state estimation is O(NL), compared with O(N^2 L) for the HMM, where N is the number of states and L is the sequence length. TI-HBM also provides a new framework for duration modeling: it models acoustic-unit duration (e.g. phone duration) through a built-in parameter called the survival probability. As in the HMM case, the three essential problems are solved for TI-HBM, and an EM-based method is proposed for training its parameters. Phone recognition experiments on spoken Persian (Farsi) show that TI-HBM is simpler and faster than the HMM in the recognition phase, and that it also outperforms the HMM in phone recognition accuracy.
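The complexity claim can be illustrated with a minimal sketch. The abstract does not give the TI-HBM equations, so the code below is not the authors' formulation: it assumes a generic independent-state model in which each frame t has its own state prior, and it leaves out the survival probability entirely. The function names and the `log_prior`, `log_emit`, `log_init`, and `log_trans` arguments are hypothetical. Under these assumptions, evaluation and per-frame state estimation need one pass over N states per frame, whereas the HMM forward recursion sums over N predecessors for each of N states.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def independent_state_decode(log_prior, log_emit):
    """Evaluation and state estimation when states are independent across frames.

    ASSUMPTION: log_prior[t][i] is the log-probability of state i at frame t,
    and log_emit[t][i] is the log-likelihood of frame t under state i.
    Cost is O(N) per frame, so O(N L) overall.
    """
    total_loglik = 0.0
    best_states = []
    for prior_t, emit_t in zip(log_prior, log_emit):
        joint = [p + e for p, e in zip(prior_t, emit_t)]
        total_loglik += _logsumexp(joint)           # marginalize over states
        best_states.append(joint.index(max(joint)))  # best state for this frame
    return total_loglik, best_states


def hmm_forward_loglik(log_init, log_trans, log_emit):
    """Standard HMM forward algorithm, shown only for contrast.

    Each frame sums over N predecessors for each of N states,
    so the cost is O(N^2) per frame and O(N^2 L) overall.
    """
    n = len(log_init)
    alpha = [log_init[i] + log_emit[0][i] for i in range(n)]
    for emit_t in log_emit[1:]:
        alpha = [
            _logsumexp([alpha[i] + log_trans[i][j] for i in range(n)]) + emit_t[j]
            for j in range(n)
        ]
    return _logsumexp(alpha)
```

As a sanity check on the sketch, an HMM whose transition rows all equal the same prior has independent states, so both functions should return the same log-likelihood for it; only the amount of work differs.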