Discriminative training has been a leading factor in improving automatic speech recognition (ASR) performance over the last decade. Traditional discriminative training, however, aims to minimize the empirical error rate on the training set, which may not generalize well to test sets. Many attempts have been made recently to incorporate the principle of large margin (PLM) into the training of hidden Markov models (HMMs) in ASR to improve their generalization ability. Significant error rate reductions on test sets have been observed on both small-vocabulary and large-vocabulary continuous ASR tasks using large-margin discriminative training (LMDT) techniques. In this paper, we introduce the PLM, define the concept of margin in HMMs, and survey a number of popular LMDT algorithms proposed and developed recently. Specifically, we review and compare large-margin minimum classification error (LM-MCE) estimation, soft-margin estimation (SME), large margin estimation (LME), large relative margin estimation (LRME), and large margin training (LMT), with a focus on the insights, training criteria, and optimization techniques used, and on the strengths and weaknesses of these approaches. We conclude by suggesting directions for future research.