Sub-band feature statistics compensation techniques based on discrete wavelet transform for robust speech recognition

Authors:
Hao-Teng Fan;Jeih-weih Hung
Affiliations:
Dept of Electrical Engineering, National Chi Nan University, Taiwan, Republic of China;Dept of Electrical Engineering, National Chi Nan University, Taiwan, Republic of China
Venue:
ICME'09 Proceedings of the 2009 IEEE international conference on Multimedia and Expo
Year:
2009

Abstract

This paper proposes a novel scheme for applying feature statistics normalization techniques to robust speech recognition. In the proposed approach, the temporal-domain feature sequence is first decomposed into non-uniform sub-bands using the discrete wavelet transform (DWT), and each sub-band stream is then processed individually by a well-known normalization method such as mean and variance normalization (MVN) or histogram equalization (HEQ). Finally, the feature stream is reconstructed from the modified sub-band streams using the inverse DWT. With this process, the components that correspond to the more important modulation spectral bands of the feature sequence can be processed separately. For the Aurora-2 clean-condition training task, the proposed sub-band MVN and HEQ provide relative error rate reductions of 20.18% and 19.65% over conventional MVN and HEQ, respectively.
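The decompose, normalize, and reconstruct pipeline described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the two decomposition levels, the function names (`haar_dwt`, `haar_idwt`, `mvn`, `subband_mvn`), and the choice to leave each sub-band at unit variance after MVN. The input is taken to be the time trajectory of a single feature dimension over one utterance.

```python
import math

def haar_dwt(x):
    """One-level Haar DWT. Returns (approximation, detail); len(x) must be even."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x

def mvn(seq, eps=1e-12):
    """Mean and variance normalization of one stream (zero mean, unit variance)."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / n)
    if std < eps:
        # A constant stream carries no variance to normalize; only remove the mean.
        return [0.0 for _ in seq]
    return [(v - mean) / std for v in seq]

def subband_mvn(seq, levels=2):
    """Apply MVN separately to each DWT sub-band, then reconstruct (assumed setup)."""
    if len(seq) % (2 ** levels) != 0:
        raise ValueError("sequence length must be a multiple of 2**levels")
    # Analysis: repeatedly splitting the approximation gives octave-spaced,
    # hence non-uniform, sub-bands.
    approx = list(seq)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    # Normalize every sub-band stream on its own.
    approx = mvn(approx)
    details = [mvn(d) for d in details]
    # Synthesis: inverse DWT from the coarsest level back to the full rate.
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx

# Hypothetical feature trajectory (e.g. one cepstral coefficient over 16 frames).
trajectory = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0, 9.0,
              6.0, 2.0, 8.0, 1.0, 4.0, 7.0, 0.0, 5.0]
normalized = subband_mvn(trajectory, levels=2)
```

Replacing `mvn` with a histogram equalization routine would give a sub-band HEQ variant under the same assumptions. A real front end would also need to pad or truncate utterances whose length is not a multiple of `2**levels`, which this sketch simply rejects.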