Modeling state durations in hidden Markov models for automatic speech recognition

Authors:
Padma Ramesh;Jay G. Wilpon
Affiliations:
Speech Research Department, AT&T Bell Laboratories, Murray Hill, NJ;Speech Research Department, AT&T Bell Laboratories, Murray Hill, NJ
Venue:
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
Year:
1992

Citing 1
Cited 5

Continuously variable duration hidden Markov models for automatic speech recognition

Computer Speech and Language

Hidden Markov model with duration side information for novel HMMD derivation, with application to eukaryotic gene finding

EURASIP Journal on Advances in Signal Processing - Special issue on genomic signal processing

Modelling lexical stress

TSD'05 Proceedings of the 8th international conference on Text, Speech and Dialogue

Cascading discriminant and generative models for protein secondary structure prediction

PRIB'12 Proceedings of the 7th IAPR international conference on Pattern Recognition in Bioinformatics

Hierarchical multi-channel hidden semi Markov graphical models for activity recognition

Computer Vision and Image Understanding

A voice command system for AUTONOMY using a novel speech alignment algorithm

International Journal of Speech Technology

Abstract

Hidden Markov modeling (HMM) techniques have been used successfully for connected speech recognition over the last several years. In traditional HMM algorithms, the probability of a state's duration decreases exponentially with time, which is not appropriate for representing the temporal structure of speech. Non-parametric modeling of duration using semi-Markov chains accomplishes the task, but at the cost of a large increase in computational complexity. Applying a post-processing state duration penalty after Viterbi decoding adds very little computation but does not affect the forward recognition path. In this paper we present a way of modeling state durations in HMMs using time-dependent state transitions. This new inhomogeneous HMM (IHMM) increases the computation by a small amount but reduces recognition error rates by 14-25%. A suboptimal implementation of this scheme, which requires no more computation than the traditional HMM, is also presented; it reduces errors by 14-22% on a variety of databases.
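
The contrast the abstract draws between a fixed self-transition and a time-dependent one can be illustrated with a small sketch. It is not taken from the paper: the function name `duration_pmf`, the self-loop value 0.8, and the step-shaped schedule are all hypothetical choices made only to show the effect. The sketch relies on the standard identity that the probability of staying exactly d frames is the probability of surviving d-1 self-loops times the probability of leaving at frame d.

```python
def duration_pmf(self_loop, max_d):
    """Return P(state lasts exactly d frames) for d = 1..max_d.

    self_loop(d) is the probability of remaining in the state given that
    d frames have already been spent there (hypothetical interface).
    """
    pmf = []
    survive = 1.0  # probability of still being in the state at frame d
    for d in range(1, max_d + 1):
        stay = self_loop(d)
        pmf.append(survive * (1.0 - stay))  # leave right after frame d
        survive *= stay                     # or stay for another frame
    return pmf


# Homogeneous HMM: a constant self-loop yields a geometric duration law,
# so the most likely duration is always a single frame.
geometric = duration_pmf(lambda d: 0.8, 10)

# Time-dependent self-loop (illustrative schedule, not the paper's):
# very likely to stay for the first 3 frames, likely to leave afterwards.
peaked = duration_pmf(lambda d: 0.95 if d < 4 else 0.3, 10)

if __name__ == "__main__":
    for d, (g, p) in enumerate(zip(geometric, peaked), start=1):
        print(f"d={d:2d}  constant={g:.4f}  time-dependent={p:.4f}")
```

With the constant self-loop the duration probabilities fall monotonically from d = 1, while the time-dependent schedule moves the peak to a later frame, which is the kind of temporal structure the exponential decay cannot represent.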