Hidden Markov models, maximum mutual information estimation, and the speech recognition problem
Improvements in connected digit recognition using higher order spectral and energy features
Proceedings of ICASSP '91, IEEE International Conference on Acoustics, Speech, and Signal Processing, 1991
High performance connected digit recognition using maximum mutual information estimation
Proceedings of ICASSP '91, IEEE International Conference on Acoustics, Speech, and Signal Processing, 1991
Improvements in connected digit recognition using linear discriminant analysis and mixture densities
Proceedings of ICASSP '93, IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. II (Speech Processing), 1993
Inter-word coarticulation modeling and MMIE training for improved connected digit recognition
Proceedings of ICASSP '93, IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. II (Speech Processing), 1993
An algorithm for the dynamic inference of hidden Markov models (DIHMM)
Proceedings of ICASSP '93, IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. II (Speech Processing), 1993
This paper describes the latest developments by the speech research group at CRIM in speaker-independent connected digit recognition using Hidden Markov Models (HMMs) trained with Maximum Mutual Information Estimation (MMIE). The work presented here is a continuation of previous work described in [1]. The main differences are: 1) use of the 20 kHz TI/NIST corpus available on CD-ROM (instead of the 10 kHz distribution tape), 2) use of word models (instead of sub-word units), 3) addition of second-derivative parameters, and 4) a more elaborate training procedure for codebook exponents. The experiments described in this paper were all performed on the complete adult portion of the corpus. Our baseline system, using discrete HMMs and MMIE, achieves a 0.67% word error rate and a 2.03% string error rate. The paper describes techniques that allowed us to greatly improve the recognition rate. New results include a 0.41% word error rate and a 1.25% string error rate with two models per digit (one for male and one for female speakers) using discrete HMMs.
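For context, the MMIE training criterion referred to above can be sketched in its standard form (this is the textbook objective, not necessarily the exact formulation used in the paper). Given training utterances with acoustic observations $O_r$ and transcriptions $w_r$, MMIE adjusts the HMM parameters $\theta$ to maximize

```latex
\mathcal{F}_{\mathrm{MMIE}}(\theta)
  = \sum_{r} \log
    \frac{P_{\theta}(O_r \mid w_r)\, P(w_r)}
         {\sum_{\hat{w}} P_{\theta}(O_r \mid \hat{w})\, P(\hat{w})}
```

where the sum in the denominator runs over all competing word sequences $\hat{w}$ (in practice, e.g. all digit strings allowed by the task grammar). Unlike maximum likelihood estimation, which optimizes only the numerator, MMIE also drives down the likelihood of competing hypotheses, which is why it tends to reduce recognition errors directly.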