Noise robustness in automatic speech recognition

Authors:
Chia-Ping Chen;Jeff Bilmes
Venue:
Noise robustness in automatic speech recognition
Year:
2004

Abstract

Noise robustness in automatic speech recognition is of practical importance and remains largely unsolved. This thesis tackles the problem from two perspectives: front-end speech features and back-end speech models. For the front end, a feature processing technique consisting of mean subtraction, variance normalization, and ARMA filtering is investigated. The distortion of speech features in the presence of additive and convolutional noise is analyzed mathematically, and extensive experiments are conducted to determine how best to use the technique. It is experimentally verified to be extremely effective on the Aurora noisy-digit databases, and this performance gain is achieved without increasing the number of model parameters or the computational cost. For the back end, a novel random variable called a feature selector is introduced into the speech models to dynamically select a robust component feature to score while ignoring the others. The values of the feature selectors are based on either the energy or the spectral entropy of the signal. With the feature streams investigated in this work, MFCCs and post-processed MFCCs, this back-end technique does not lead to a significant performance gain. It is nevertheless a novel scheme for integrating multiple information sources.
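
The front-end chain named in the abstract (mean subtraction, variance normalization, ARMA filtering) can be illustrated with a minimal sketch. The abstract does not give the filter equation, so the details here are assumptions: normalization is applied per utterance and per feature dimension, the ARMA step averages the previous `order` filtered frames with the current and next `order` normalized frames, and edge frames are left unfiltered. The function name `mva` and the parameters `order` and `eps` are hypothetical.

```python
import numpy as np


def mva(features, order=2, eps=1e-8):
    """Sketch of mean subtraction, variance normalization and ARMA filtering.

    features: array of shape (num_frames, num_dims), e.g. MFCCs of one utterance.
    order:    assumed ARMA filter order (frames of context on each side).
    eps:      small constant guarding against division by zero.
    """
    x = np.asarray(features, dtype=float)

    # Mean subtraction and variance normalization, per dimension over the utterance.
    x = (x - x.mean(axis=0)) / (x.std(axis=0) + eps)

    num_frames = x.shape[0]
    y = x.copy()  # edge frames keep their normalized values (an assumption)

    # ARMA smoothing along time: the autoregressive part uses already-filtered
    # past outputs, the moving-average part uses current and future inputs.
    for t in range(order, num_frames - order):
        past_outputs = y[t - order:t].sum(axis=0)
        current_and_future_inputs = x[t:t + order + 1].sum(axis=0)
        y[t] = (past_outputs + current_and_future_inputs) / (2 * order + 1)

    return y
```

Calling `mva(mfcc, order=2)` on a `(num_frames, num_dims)` array returns an array of the same shape. Because the step only transforms the features, it adds no model parameters, which is consistent with the claim in the abstract.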