Restructuring Gaussian mixture density functions in speaker-independent acoustic models

  • Authors:
  • Atsushi Nakamura

  • Affiliations:
  • NTT Communication Science Laboratories, D-202, 2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0237, Japan

  • Venue:
  • Speech Communication
  • Year:
  • 2002

Abstract

In continuous speech recognition that combines hidden Markov models (HMMs), word N-grams, and time-synchronous beam search, a local modeling mismatch in the HMMs often degrades recognition performance. To address this problem, this paper proposes a method for restructuring the Gaussian mixture output probability density functions (pdfs) of a pre-trained speaker-independent HMM set on the basis of speech data. In this method, Gaussians are copied from other mixture pdfs, taking the distribution of local errors into account. The result is a restructured set of mixture pdfs in which some Gaussians are shared by several states while the total number of Gaussians remains unchanged. Furthermore, the distribution of local errors is extracted by comparing the pre-trained HMM set with the speech data used in pre-training, so the method requires no new training data. Experimental results show that the proposed restructuring method can effectively correct local modeling mismatches and improve recognition performance.
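
The sharing idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say how the distribution of local errors selects which Gaussian to copy, nor how mixture weights are reassigned, so both are assumptions here. The names `pool`, `states`, `mixture_pdf`, and `share_gaussian`, the one-dimensional Gaussians, and all numeric values are hypothetical. The sketch shows only the structural property the abstract describes: a state's mixture gains a reference to a Gaussian that already belongs to another state, so the Gaussian is shared and the total number of Gaussians is unchanged.

```python
import math

# Shared pool of 1-D Gaussians as (mean, variance). Values are made up.
pool = [(0.0, 1.0), (3.0, 0.5), (6.0, 2.0)]

# Each state's mixture is a list of (pool_index, weight); weights sum to 1.
states = {
    "s1": [(0, 0.6), (1, 0.4)],
    "s2": [(2, 1.0)],
}


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, mixture):
    """Weighted sum of the pool Gaussians referenced by one state's mixture."""
    return sum(w * gaussian_pdf(x, *pool[i]) for i, w in mixture)


def share_gaussian(mixture, pool_index, weight):
    """Return a new mixture that also references pool[pool_index].

    Assumption: existing weights are scaled by (1 - weight) so that the
    weights still sum to 1. The pool itself is never modified.
    """
    if any(i == pool_index for i, _ in mixture):
        return list(mixture)  # already referenced; nothing to do
    scaled = [(i, w * (1.0 - weight)) for i, w in mixture]
    return scaled + [(pool_index, weight)]


# Hypothetical decision: state "s2" borrows Gaussian 1, which "s1" also uses.
states["s2"] = share_gaussian(states["s2"], 1, 0.25)
```

After the call, Gaussian 1 is referenced by both `s1` and `s2`, while `pool` still holds three Gaussians. Storing indices into a shared pool rather than duplicating parameters is one simple way to represent such tying.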