2D psychoacoustic filtering for robust speech recognition

Authors:
Peng Dai; Ing Yann Soon; Chai Kiat Yeo
Affiliations:
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore; School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore; School of Computer Engineering, Nanyang Technological University, Singapore
Venue:
ICICS'09: Proceedings of the 7th International Conference on Information, Communications and Signal Processing
Year:
2009

Abstract

One weakness of speech recognition systems is their lack of robustness to background noise compared with human listeners under similar conditions. This paper proposes a 2D psychoacoustic modeling algorithm that is integrated into the feature extraction front-end of a hidden Markov model (HMM) based recognizer. The proposed algorithm incorporates properties of the human auditory system and applies them to the speech recognition system to enhance its robustness. It integrates forward masking, lateral inhibition and Cepstral Mean Normalization into the ordinary mel-frequency cepstral coefficient (MFCC) feature extraction algorithm. Experiments carried out on the AURORA2 database show that the word recognition rate can be improved significantly at low computational cost.
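
Of the three components named in the abstract, Cepstral Mean Normalization has a standard textbook form, so it is the one that can be illustrated without guessing at the paper's details. The sketch below is a minimal illustration of that generic technique, not the authors' implementation: the function name `cepstral_mean_normalize` and the toy data are assumptions made for this example, and the forward masking and lateral inhibition stages are not shown because the abstract does not specify them.

```python
from typing import List


def cepstral_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Generic per-utterance cepstral mean normalization (illustrative only).

    `frames` holds one list of cepstral coefficients per time frame.
    The mean of each coefficient over all frames is subtracted, which
    removes any offset that is constant across the utterance.
    """
    if not frames:
        return []
    num_frames = len(frames)
    num_coeffs = len(frames[0])
    # Mean of coefficient k taken over every frame of the utterance.
    means = [
        sum(frame[k] for frame in frames) / num_frames
        for k in range(num_coeffs)
    ]
    # Subtract that mean from coefficient k in each frame.
    return [
        [frame[k] - means[k] for k in range(num_coeffs)]
        for frame in frames
    ]


# Toy utterance: 2 frames, 2 cepstral coefficients each (made-up values).
utterance = [[1.0, 4.0], [3.0, 8.0]]
normalized = cepstral_mean_normalize(utterance)

# The same utterance with a constant offset added to every coefficient,
# standing in for a fixed channel effect in the cepstral domain.
shifted = [[c + 0.5 for c in frame] for frame in utterance]
normalized_shifted = cepstral_mean_normalize(shifted)
```

In this toy case `normalized` and `normalized_shifted` are identical, which is the property that makes the technique useful against stationary channel distortion: a constant additive term in the cepstral domain is cancelled by the mean subtraction.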