Compensating the speech features via discrete cosine transform for robust speech recognition

  • Authors:
  • Hsin-Ju Hsieh, Wen-hsiang Tu, Jeih-weih Hung

  • Affiliations:
  • National Chi Nan University, Taiwan, Republic of China (all authors)

  • Venue:
  • ROCLING '11 Proceedings of the 23rd Conference on Computational Linguistics and Speech Processing
  • Year:
  • 2011


Abstract

In this paper, we develop a series of algorithms that improve the noise robustness of speech features based on the discrete cosine transform (DCT). The DCT-based modulation spectra of the clean speech feature streams in the training set are used to generate two sequences, representing the reference magnitudes and the magnitude weights, respectively. These two sequences are then used to update the magnitude spectrum of each feature stream in the training and testing sets. The resulting feature streams are shown to be robust against noise distortion. Experiments conducted on the Aurora-2 digit string database reveal that the proposed DCT-based approaches provide relative error reduction rates of over 25% compared with the baseline system using MVN-processed MFCC features. Experimental results also show that these new algorithms combine well with many other noise robustness methods, yielding even higher recognition accuracy.
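The abstract describes the compensation only at a high level, so the sketch below (Python with NumPy/SciPy) illustrates one plausible reading of it: take the DCT of each cepstral feature stream along the time axis, estimate a per-bin reference magnitude and weight from clean training streams, and use them to adjust the magnitude of each stream's DCT coefficients while keeping their signs. The function names, the use of the mean magnitude as the reference, the inverse-variance weights, and the convex-combination update rule are all illustrative assumptions, not the paper's exact formulas.

```python
import numpy as np
from scipy.fft import dct, idct


def reference_stats(clean_streams):
    """Estimate per-bin reference magnitudes and weights from clean
    training feature streams (2-D array: streams x frames).

    The mean magnitude and the inverse-variance weighting are
    hypothetical choices for illustration only.
    """
    mags = np.abs(dct(np.asarray(clean_streams), type=2, norm="ortho", axis=1))
    ref_mag = mags.mean(axis=0)                  # assumed reference magnitudes
    weights = 1.0 / (1.0 + mags.var(axis=0))     # assumed magnitude weights
    return ref_mag, weights


def compensate_stream(stream, ref_mag, weights):
    """Update the DCT magnitude spectrum of one feature stream toward the
    reference, keep the coefficient signs, and transform back."""
    coeffs = dct(np.asarray(stream), type=2, norm="ortho")
    signs = np.sign(coeffs)
    # Assumed update rule: weighted blend of reference and observed magnitudes.
    new_mag = weights * ref_mag + (1.0 - weights) * np.abs(coeffs)
    return idct(signs * new_mag, type=2, norm="ortho")
```

In this sketch each "feature stream" is the trajectory of a single MFCC coefficient over all frames of an utterance (after MVN), so the same compensation would be applied independently to every cepstral dimension in both the training and testing sets, as the abstract indicates.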