This paper describes the development of an optimal sigmoidal rate-level function, a component of many models of the peripheral auditory system. The optimization uses a set of criteria that are defined exclusively on physical attributes of the input sound and are inspired by physiological evidence. These criteria attempt to discriminate between a degraded speech signal and noise, so as to preserve the maximum amount of information in the linear region of the sigmoidal curve and to minimize the effects of distortion in the saturating regions. The proposed optimal sigmoidal function is validated in text-independent speaker-verification experiments with signals corrupted by additive noise at different SNRs. The experimental results suggest that, for some SNRs, the approach combined with cepstral variance normalization can yield relative reductions in equal error rate of up to 40% compared with baseline MFCC coefficients.
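For readers unfamiliar with the two building blocks named above, the following is a minimal illustrative sketch, not the paper's method. It assumes a generic logistic curve as the sigmoidal rate-level function and a plain per-coefficient scaling as cepstral variance normalization. The function names and the parameters `midpoint_db`, `slope`, and `max_rate` are hypothetical placeholders; the abstract does not give the optimized values or the exact functional form.

```python
import math


def rate_level(level_db, midpoint_db=50.0, slope=0.15, max_rate=1.0):
    """Generic logistic rate-level curve (assumed form, hypothetical parameters).

    Maps an input level in dB to a normalized firing rate. Levels near
    midpoint_db fall in the roughly linear region; levels far below or
    above it fall in the saturating regions.
    """
    return max_rate / (1.0 + math.exp(-slope * (level_db - midpoint_db)))


def cepstral_variance_normalize(frames):
    """Scale each cepstral coefficient to unit variance across frames.

    frames is a list of equal-length feature vectors (one per frame).
    Coefficients with zero variance are left unscaled.
    """
    n_frames = len(frames)
    n_dims = len(frames[0])
    normalized = [[0.0] * n_dims for _ in range(n_frames)]
    for d in range(n_dims):
        column = [frame[d] for frame in frames]
        mean = sum(column) / n_frames
        variance = sum((x - mean) ** 2 for x in column) / n_frames
        std = math.sqrt(variance) or 1.0  # avoid division by zero
        for i in range(n_frames):
            normalized[i][d] = frames[i][d] / std
    return normalized


if __name__ == "__main__":
    # Equal 10 dB steps produce a large change near the midpoint
    # and a much smaller change in the saturating region.
    print(rate_level(55.0) - rate_level(45.0))
    print(rate_level(100.0) - rate_level(90.0))
```

Under these assumptions, shifting `midpoint_db` and changing `slope` moves and stretches the linear region, which is the kind of degree of freedom the optimization criteria described above would act on.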