Large-Margin Discriminative Training of Hidden Markov Models for Speech Recognition

Authors:
Dong Yu; Li Deng
Affiliations:
Microsoft Research, USA; Microsoft Research, USA
Venue:
ICSC '07 Proceedings of the International Conference on Semantic Computing
Year:
2007

Abstract

Discriminative training has been a leading factor in improving automatic speech recognition (ASR) performance over the last decade. Traditional discriminative training, however, aims to minimize the empirical error rate on the training set, which may not generalize well to test sets. Many attempts have been made recently to incorporate the principle of large margin (PLM) into the training of hidden Markov models (HMMs) for ASR to improve generalization. Significant error rate reductions on test sets have been observed on both small-vocabulary and large-vocabulary continuous ASR tasks using large-margin discriminative training (LMDT) techniques. In this paper, we introduce the PLM, define the concept of margin in HMMs, and survey a number of popular LMDT algorithms proposed and developed recently. Specifically, we review and compare large-margin minimum classification error (LM-MCE) estimation, soft-margin estimation (SME), large margin estimation (LME), large relative margin estimation (LRME), and large margin training (LMT), with a focus on the insights, training criteria, optimization techniques, and strengths and weaknesses of these approaches. We conclude by suggesting directions for future research.