Two perceptual properties of the peripheral auditory system, synaptic adaptation and two-tone suppression, are compared in terms of their effect on automatic speech recognition (ASR) performance in additive noise. A simple form of synaptic adaptation, based on psychoacoustic observations, was implemented through temporal processing of speech, with a zero-crossing auditory model serving as the pre-processing front end. The concept is similar to RASTA processing, but a high-pass infinite impulse response (IIR) filter is used in place of the bandpass filters. It is shown that rapid synaptic adaptation can be implemented by temporal processing with the zero-crossing algorithm, which is not otherwise possible in a spectral-domain implementation. Two-tone suppression was implemented in the zero-crossing auditory model using a companding strategy. Recognition performance with the two perceptual features was evaluated on the isolated-digit TIDIGITS corpus using a continuous-density HMM recognizer in white, factory, babble, and Volvo noise. Synaptic adaptation is observed to perform better in stationary white Gaussian noise. In the presence of non-stationary, non-Gaussian noise, however, it yields no improvement or even a degradation. Two-tone suppression shows the reciprocal effect: better performance in non-Gaussian real-world noise and a degradation in stationary white Gaussian noise.
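The temporal high-pass idea can be illustrated with a minimal sketch. This is not the authors' implementation: the first-order filter form, the pole value of 0.9, the function name `highpass_trajectories`, and the frames-by-channels input layout are all assumptions chosen for illustration. It only shows how a high-pass IIR filter applied along time removes the steady component of each channel and emphasizes onsets.

```python
import numpy as np


def highpass_trajectories(features, pole=0.9):
    """Apply a first-order high-pass IIR filter along the time axis.

    Difference equation (assumed form, for illustration only):
        y[t] = x[t] - x[t-1] + pole * y[t-1]

    features: array of shape (num_frames, num_channels)
    pole:     assumed pole location, 0 <= pole < 1; values closer to 1
              give slower decay after an onset
    """
    x = np.asarray(features, dtype=float)
    y = np.zeros_like(x)
    prev_x = x[0].copy()              # start from the first frame so a constant input gives zero output
    prev_y = np.zeros(x.shape[1])
    for t in range(x.shape[0]):
        prev_y = x[t] - prev_x + pole * prev_y
        y[t] = prev_y
        prev_x = x[t]
    return y


if __name__ == "__main__":
    # One channel with a step onset at frame 5: the output peaks at the
    # onset and then decays geometrically at the rate set by the pole.
    step = np.zeros((10, 1))
    step[5:] = 1.0
    print(highpass_trajectories(step)[:, 0])
```

With this form, a constant channel level maps to zero and a sudden onset produces a peak that decays over the following frames, which is the qualitative behavior associated with adaptation. The pole trades off how quickly the response to a sustained input dies away.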