High performance connected digit recognition using codebook exponents

  • Authors:
  • Régis Cardin;Yves Normandin;Renato De Mori

  • Affiliations:
  • Centre de Recherche Informatique de Montréal, Montréal, Québec, Canada;Centre de Recherche Informatique de Montréal, Montréal, Québec, Canada;Centre de Recherche Informatique de Montréal, Montréal, Québec, Canada

  • Venue:
  • ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
  • Year:
  • 1992

Abstract

This paper describes the latest developments by the speech research group at CRIM in speaker-independent connected digit recognition using Hidden Markov Models (HMMs) trained with Maximum Mutual Information Estimation (MMIE). The work continues that described in [1]. The main differences are: (1) use of the 20 kHz TI/NIST corpus available on CD-ROM (instead of the 10 kHz distribution tape), (2) use of word models (instead of sub-word units), (3) addition of second derivative parameters, and (4) a more elaborate training procedure for codebook exponents. All experiments described in this paper were performed on the complete adult portion of the corpus. Our baseline system, using discrete HMMs and MMIE, has a 0.67% word error rate and a 2.03% string error rate. The paper describes techniques that substantially improved the recognition rate. New results include a 0.41% word error rate and a 1.25% string error rate with two models per digit (one for male and one for female speakers) using discrete HMMs.
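
The two metrics quoted in the abstract can be illustrated with a short sketch. It assumes the common definitions: word error rate as the Levenshtein distance (substitutions, deletions, insertions) between reference and recognized digit strings divided by the number of reference digits, and string error rate as the fraction of strings containing at least one error. The abstract does not spell out its scoring procedure, so these definitions, the function names, and the toy digit strings are illustrative assumptions rather than the authors' setup.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between a reference and a hypothesis word sequence."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rates(pairs: List[Tuple[List[str], List[str]]]) -> Tuple[float, float]:
    """Return (word error rate, string error rate), both in percent."""
    word_errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    ref_words = sum(len(ref) for ref, _ in pairs)
    wrong_strings = sum(1 for ref, hyp in pairs if ref != hyp)
    return 100.0 * word_errors / ref_words, 100.0 * wrong_strings / len(pairs)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative error reduction in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical (reference, recognized) digit strings, not taken from the corpus.
pairs = [
    (["1", "2", "3"], ["1", "2", "3"]),  # correct string
    (["4", "5"], ["4", "9"]),            # one substitution
    (["7", "oh", "8"], ["7", "8"]),      # one deletion
]
wer, ser = error_rates(pairs)
print(f"word error rate: {wer:.2f}%, string error rate: {ser:.2f}%")

# Figures quoted in the abstract: baseline versus two models per digit.
print(f"relative WER reduction: {relative_reduction(0.67, 0.41):.1f}%")
print(f"relative SER reduction: {relative_reduction(2.03, 1.25):.1f}%")
```

Applied to the figures in the abstract, the last two lines give relative reductions of about 38.8% in word error rate and 38.4% in string error rate between the baseline and the two-models-per-digit system.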