Overall risk criterion estimation of hidden Markov model parameters

Authors:
Janez Kaiser;Bogomir Horvat;Zdravko Kačič
Affiliations:
HERMES SoftLab, Gosposvetska 84, SI-2000 Maribor, Slovenia;FERI, Smetanova 17, SI-2000 Maribor, Slovenia and Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova 17, SI-2000 Maribor, Slovenia;FERI, Smetanova 17, SI-2000 Maribor, Slovenia and Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova 17, SI-2000 Maribor, Slovenia
Venue:
Speech Communication
Year:
2002


Abstract

In this paper, we propose a novel discriminative objective function for the estimation of hidden Markov model (HMM) parameters, based on the calculation of overall risk. For continuous speech recognition, the algorithm minimises the risk of misclassification on the training database and thus maximises recognition accuracy. The calculation of the risk is based on the Levenshtein distance between the correct transcription and the N-best recognised transcriptions, which counts the recognition errors (deleted, substituted and inserted words). The proposed criterion is minimised using the extended Baum-Welch algorithm for the estimation of discrete HMM parameters and Normandin's extension of that algorithm for the estimation of continuous densities. We tested the performance of the proposed algorithm on two tasks: phoneme recognition on the TIMIT database and continuous speech recognition on the Resource Management (RM) database. The results show a consistent decrease in recognition error rate compared with standard maximum likelihood estimation training. The highest decrease in word error rate achieved in our experiments is 20.8% on the TIMIT phoneme recognition task, 39.6% on the RM task with context-independent HMMs and 18.22% on the RM task with context-dependent models.
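The risk calculation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact form of the criterion, so the sketch assumes the risk of one utterance is the posterior-weighted average of the Levenshtein distances to the N-best hypotheses, and that the posteriors are approximated by normalising the hypotheses' log scores over the N-best list. The function names `levenshtein` and `overall_risk` and the parameter `log_scores` are hypothetical.

```python
import math
from typing import List, Sequence


def levenshtein(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level edit distance: the minimum number of deleted,
    substituted and inserted words needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word deleted
                curr[j - 1] + 1,     # hypothesis word inserted
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def overall_risk(ref: Sequence[str],
                 nbest: List[Sequence[str]],
                 log_scores: List[float]) -> float:
    """Expected number of word errors over an N-best list.

    Assumption (not from the paper): posteriors are a softmax of the
    recogniser's log scores, restricted to the N-best hypotheses.
    """
    top = max(log_scores)  # subtract the maximum for numerical stability
    weights = [math.exp(score - top) for score in log_scores]
    total = sum(weights)
    return sum(w / total * levenshtein(ref, hyp)
               for w, hyp in zip(weights, nbest))
```

For example, with the reference "show all ships" and three equally scored hypotheses "show all ships", "show ships" and "show all the ship", the distances are 0, 1 and 2, so the sketch gives an average risk of one word error. In a training setting, this quantity would be summed over all utterances of the training database and minimised with respect to the HMM parameters.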