Robust Noisy Speech Recognition with Adaptive Frequency Bank Selection

  • Authors:
  • Dajin Lu

  • Venue:
  • ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
  • Year:
  • 2002

Abstract

As automatic speech recognition technology matures, the robustness of speech recognition systems is becoming increasingly important. This paper addresses speech recognition in environments with additive background noise. Because the energy of different types of noise is concentrated in different frequency banks, additive noise affects each frequency bank differently. Seriously obscured frequency banks retain little of the word signal's information and are harmful to subsequent speech processing. Wu et al. [1] applied frequency bank selection theory to robust word boundary detection in noisy environments and obtained good detection results. In this paper, the theory is extended to noisy speech recognition. Unlike standard MFCC, which uses all frequency banks to compute cepstral coefficients, we use only the frequency banks that are least corrupted and discard the seriously obscured ones. Cepstral coefficients are calculated on the selected frequency banks only. Moreover, the acoustic model is adapted to match the modified acoustic features. Experiments on continuous digital speech recognition show that the proposed algorithm performs better than spectral subtraction and cepstral mean normalization at low SNRs.
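
The abstract does not say how the least corrupted frequency banks are chosen, so the following Python sketch is only an illustration of the general idea, not the paper's algorithm. It assumes that the first few frames of an utterance contain noise only, ranks the banks by a crude per-bank SNR estimate, and then computes cepstral coefficients with a DCT over the selected banks alone. The function names `select_banks` and `cepstra_from_selected`, the noise-only-frames assumption, and the SNR estimate are all hypothetical choices made for this example.

```python
import numpy as np

def select_banks(bank_energy, n_noise_frames, n_keep):
    """Return the sorted indices of the n_keep banks with the highest estimated SNR.

    bank_energy: (frames, banks) array of linear filter-bank energies.
    Assumption (not from the paper): the first n_noise_frames frames are noise only.
    """
    noise = bank_energy[:n_noise_frames].mean(axis=0)   # per-bank noise estimate
    total = bank_energy.mean(axis=0)                    # per-bank average energy
    # Crude per-bank SNR: (total - noise) / noise, with floors to avoid division by zero.
    snr = np.maximum(total - noise, 1e-10) / np.maximum(noise, 1e-10)
    keep = np.argsort(snr)[-n_keep:]                    # the n_keep largest SNR values
    return np.sort(keep)

def cepstra_from_selected(bank_energy, selected, n_ceps):
    """Unnormalized DCT-II of the log energies, taken over the selected banks only."""
    log_e = np.log(np.maximum(bank_energy[:, selected], 1e-10))
    m = len(selected)
    k = np.arange(n_ceps)[:, None]
    n = np.arange(m)[None, :]
    basis = np.cos(np.pi * k * (n + 0.5) / m)           # shape (n_ceps, m)
    return log_e @ basis.T                              # shape (frames, n_ceps)
```

Because the selected banks change the feature definition, features computed this way no longer match a model trained on all banks, which is consistent with the abstract's point that the acoustic model must be adapted as well.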