A likelihood measure based on projection-based group delay scheme for Mandarin speech recognition in noise

  • Authors:
  • Kuo-Chang Huang;Shin-Lun Tung;Yau-Tarng Juang

  • Affiliations:
  • Department of Electrical Engineering, National Central University, Chung-Li, 32054 Taiwan, ROC;Applied Research Laboratory, Telecommunication Laboratories, Chunghwa Telecom Co., Ltd., Chung-Li, 32054 Taiwan, ROC;Department of Electrical Engineering, National Central University, Chung-Li, 32054 Taiwan, ROC

  • Venue:
  • Signal Processing
  • Year:
  • 2003

Abstract

This paper investigates a projection-based group delay scheme (PGDS) likelihood measure that significantly reduces the effect of noise contamination in speech recognition. Because the norm of the cepstral/GDS vector shrinks when the speech signal is corrupted by additive noise, the HMM parameters, namely the mean vector and the covariance matrix, need to be further modified. In this paper, mean vector compensation, a covariance matrix adaptation function, and state duration modeling based on the projection-based group delay scheme are incorporated into a semi-continuous HMM to improve the recognition rate in noisy environments. The proposed approach compensates the mean vector using a projection-based scale factor and a mean compensation bias, and fits the covariance matrix using a variance adaptive function. The bias and variance adaptive functions, estimated from the training and/or testing data, are used to compensate for the mismatch between different environments. Lastly, a state duration method is used to deal with the erroneous path segmentation that additive noise causes in Viterbi decoding. Experiments show that the proposed PGDS markedly improves recognition performance in noisy environments.
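
The abstract does not give the formulas, so the following is only a minimal sketch of how a projection-based scale factor with a mean bias and a variance adaptation might enter a Gaussian log-likelihood. It assumes a diagonal covariance, takes the scale factor to be the least-squares projection of the observation onto the mean vector, and models variance adaptation as a single multiplier. The names `projection_scale`, `compensated_log_likelihood`, `bias`, and `var_scale` are hypothetical and not taken from the paper.

```python
import math


def projection_scale(x, mean, var):
    """Scale lam that minimises sum((x_i - lam * mean_i)**2 / var_i).

    Assumption: the "projection-based scale factor" is modelled here as a
    variance-weighted least-squares projection of x onto the mean vector.
    """
    num = sum(xi * mi / vi for xi, mi, vi in zip(x, mean, var))
    den = sum(mi * mi / vi for mi, vi in zip(mean, var))
    return num / den


def compensated_log_likelihood(x, mean, var, bias=None, var_scale=1.0):
    """Diagonal-Gaussian log-likelihood with a compensated mean and variance.

    bias and var_scale are hypothetical stand-ins for the mean compensation
    bias and the variance adaptive function mentioned in the abstract.
    """
    if bias is None:
        bias = [0.0] * len(mean)
    lam = projection_scale(x, mean, var)
    new_mean = [lam * m + b for m, b in zip(mean, bias)]
    new_var = [var_scale * v for v in var]
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, new_mean, new_var)
    )


# A "noisy" observation whose norm has shrunk to half of the clean mean.
clean_mean = [2.0, -1.0, 0.5]
unit_var = [1.0, 1.0, 1.0]
noisy_obs = [0.5 * m for m in clean_mean]
print(projection_scale(noisy_obs, clean_mean, unit_var))
print(compensated_log_likelihood(noisy_obs, clean_mean, unit_var))
```

In this toy case the observation is an exact scaled copy of the mean, so the scale factor absorbs the whole norm shrinkage and the compensated mean matches the observation. With real features the bias and variance terms would have to be estimated from data, as the abstract describes.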