Speech recognition in noisy environments: a survey
Speech Communication
A Data-Driven Model Parameter Compensation Method for Noise-Robust Speech Recognition
IEICE Transactions on Information and Systems
Word recognition in the car - speech enhancement / spectral transformations
Proceedings of ICASSP-91, 1991 International Conference on Acoustics, Speech, and Signal Processing
Large-margin minimum classification error training: A theoretical risk minimization perspective
Computer Speech and Language
Robust Speech Recognition Using a Cepstral Minimum-Mean-Square-Error-Motivated Noise Suppressor
IEEE Transactions on Audio, Speech, and Language Processing
A Study of Variable-Parameter Gaussian Mixture Hidden Markov Modeling for Noisy Speech Recognition
IEEE Transactions on Audio, Speech, and Language Processing
Discriminative cluster adaptive training
IEEE Transactions on Audio, Speech, and Language Processing
Using continuous features in the maximum entropy model
Pattern Recognition Letters
Handling signal variability with contextual markovian models
Pattern Recognition Letters
We propose a new framework and the associated maximum-likelihood and discriminative training algorithms for the variable-parameter hidden Markov model (VPHMM), whose mean and variance parameters vary as functions of additional environment-dependent conditioning parameters. Our framework differs from the VPHMM proposed by Cui and Gong (2007) in that piecewise spline interpolation, instead of global polynomial regression, is used to represent the dependency of the HMM parameters on the conditioning parameters, and a more effective functional form is used to model the variances. Our framework unifies and extends the conventional discrete VPHMM. It no longer requires quantization in estimating the model parameters and naturally supports both parameter sharing and instantaneous conditioning parameters. We investigate the strengths and weaknesses of the model on the Aurora-3 corpus. We show that under the well-matched condition the proposed discriminatively trained VPHMM outperforms the conventional HMM trained in the same way, with relative word error rate (WER) reductions of 19% and 15% when only the means are updated and when both the means and variances are updated, respectively.
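The core idea of the abstract can be illustrated with a minimal sketch: a Gaussian whose mean and variance are stored at a few knot values of a conditioning parameter (here, instantaneous SNR in dB) and interpolated at decode time. For simplicity this sketch uses piecewise-linear interpolation rather than the cubic splines of the paper, and it models the log-variance so the variance stays positive (one plausible reading of the "more effective functional form"; the knot values and the `interp`/`gaussian_loglik` helpers are all hypothetical).

```python
import math

def interp(knots, values, x):
    """Piecewise-linear interpolation, clamped at the boundary knots."""
    if x <= knots[0]:
        return values[0]
    if x >= knots[-1]:
        return values[-1]
    for (x0, x1), (y0, y1) in zip(zip(knots, knots[1:]), zip(values, values[1:])):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return (1 - t) * y0 + t * y1

# Made-up knots: the conditioning parameter is frame SNR in dB, and the
# Gaussian's mean and log-variance are each a function of it.
SNR_KNOTS = [0.0, 10.0, 20.0]
MEAN_KNOTS = [1.5, 0.8, 0.2]      # cepstral mean at each SNR knot
LOGVAR_KNOTS = [0.4, 0.1, -0.2]   # log-variance at each SNR knot

def gaussian_loglik(obs, snr):
    """Log-likelihood of one observation under the SNR-dependent Gaussian."""
    mean = interp(SNR_KNOTS, MEAN_KNOTS, snr)
    var = math.exp(interp(SNR_KNOTS, LOGVAR_KNOTS, snr))  # exp keeps var > 0
    return -0.5 * (math.log(2 * math.pi * var) + (obs - mean) ** 2 / var)
```

In a full VPHMM the knot values themselves would be the trainable parameters, estimated by maximum likelihood or discriminatively, with knots shared across Gaussians to control model size.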