Robust speech recognition by normalization of the acoustic space

Authors:
A. Acero;R. M. Stern
Affiliations:
Carnegie Mellon Univ., Pittsburgh, PA, USA;Carnegie Mellon Univ., Pittsburgh, PA, USA
Venue:
ICASSP '91: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing
Year:
1991

Citing 0
Cited 3

SNR-dependent compression of enhanced Mel sub-band energies for compensation of noise effects on MFCC features
Pattern Recognition Letters

Efficient joint compensation of speech for the effects of additive noise and linear filtering
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1

Speaker-adaptive speech recognition using speaker diarization for improved transcription of large spoken archives
Speech Communication

Quantified Score

Hi-index	0.00

Abstract

Several algorithms are presented that increase the robustness of SPHINX, the CMU (Carnegie Mellon University) continuous-speech speaker-independent recognition system, by normalizing the acoustic space via minimization of the overall VQ distortion. The authors propose an affine transformation of the cepstrum in which a matrix multiplication performs frequency normalization and a vector addition attempts environment normalization. The algorithms for environment normalization are efficient and improve the recognition accuracy when the system is tested on a microphone other than the one on which it was trained. The frequency normalization algorithm applies a different warping on the frequency axis to different speakers, and it achieves a 10% decrease in error rate.
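
The affine form described in the abstract can be illustrated with a short sketch. This is not the authors' algorithm: the function names (`affine`, `vq_distortion`, `estimate_offset`), the squared Euclidean distance, and the alternating update for the additive vector are all assumptions chosen for illustration. The matrix A is held at the identity, so only the vector addition (the environment term) is estimated.

```python
def affine(frame, A, b):
    """Return A * frame + b for one cepstral vector (plain Python lists)."""
    return [sum(a * x for a, x in zip(row, frame)) + bi
            for row, bi in zip(A, b)]


def sq_dist(u, v):
    """Squared Euclidean distance (an assumed distortion measure)."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


def vq_distortion(frames, codebook):
    """Average squared distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def estimate_offset(frames, codebook, iterations=10):
    """Hypothetical alternating estimate of the additive vector b, with A = I.

    Each pass assigns every shifted frame to its nearest codeword, then sets
    b to the mean residual (codeword minus original frame), which minimizes
    the squared error for those fixed assignments.
    """
    dim = len(frames[0])
    b = [0.0] * dim
    for _ in range(iterations):
        shifted = [[x + bi for x, bi in zip(f, b)] for f in frames]
        nearest = [min(codebook, key=lambda c: sq_dist(s, c)) for s in shifted]
        b = [sum(c[d] - f[d] for c, f in zip(nearest, frames)) / len(frames)
             for d in range(dim)]
    return b
```

Under these assumptions, `estimate_offset` returns a vector that can be passed to `affine` together with an identity matrix to normalize each frame, and `vq_distortion` can be compared before and after to check that the offset reduced the mismatch with the codebook.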