Optimization of the parameters characterizing sigmoidal rate-level functions based on acoustic features

Authors:
Víctor Poblete;Néstor Becerra Yoma;Richard M. Stern
Affiliations:
Speech Processing and Transmission Laboratory, Universidad de Chile, Av. Tupper 2007, P.O. Box 412-3, Santiago, Chile and Institute of Acoustics, Universidad Austral de Chile, Av. General Lagos 20 ...;Speech Processing and Transmission Laboratory, Universidad de Chile, Av. Tupper 2007, P.O. Box 412-3, Santiago, Chile;Department of Electrical and Computer Engineering and Language Technologies Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
Venue:
Speech Communication
Year:
2014

Abstract

This paper describes the optimization of the sigmoidal rate-level function, a component of many models of the peripheral auditory system. The optimization uses a set of criteria that are defined exclusively in terms of physical attributes of the input sound and are inspired by physiological evidence. These criteria attempt to discriminate between a degraded speech signal and noise, so as to preserve the maximum amount of information in the linear region of the sigmoidal curve and to minimize the effects of distortion in the saturating regions. The optimized sigmoidal function is validated in text-independent speaker-verification experiments with signals corrupted by additive noise at different SNRs. The results suggest that, for some SNRs, the proposed approach combined with cepstral variance normalization can reduce the equal error rate by as much as 40% relative to baseline MFCC features.
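
As a rough illustration of the terms used in the abstract, the sketch below models a rate-level function as a generic logistic curve, locates its approximately linear region, and computes a relative reduction in equal error rate. The abstract does not give the paper's parameterization or optimization criteria, so the function names, the parameters (`midpoint_db`, `slope`, `max_rate`), the 10%-90% definition of the linear region, and the example error rates are all assumptions made for illustration.

```python
import math


def rate_level(level_db, midpoint_db=40.0, slope=0.15, max_rate=1.0):
    """Generic logistic rate-level curve (assumed form, not the paper's).

    Maps an input level in dB to an output rate between 0 and max_rate.
    """
    return max_rate / (1.0 + math.exp(-slope * (level_db - midpoint_db)))


def linear_region(midpoint_db=40.0, slope=0.15, lo=0.1, hi=0.9):
    """Input levels (dB) at which the normalized output equals lo and hi.

    Between these two levels the curve is roughly linear; outside them
    it flattens into the saturating regions. The 10%-90% bounds are an
    arbitrary illustrative choice.
    """
    def logit(p):
        return math.log(p / (1.0 - p))

    # Invert the logistic: level = midpoint + logit(p) / slope
    return (midpoint_db + logit(lo) / slope,
            midpoint_db + logit(hi) / slope)


def relative_reduction(baseline_eer, new_eer):
    """Relative reduction in equal error rate, as a fraction of baseline."""
    return (baseline_eer - new_eer) / baseline_eer


low_db, high_db = linear_region()
print(f"approximately linear between {low_db:.1f} dB and {high_db:.1f} dB")

# Hypothetical error rates, chosen only to show the arithmetic.
print(f"relative reduction: {relative_reduction(10.0, 6.0):.0%}")
```

In this toy model, shifting `midpoint_db` or changing `slope` moves or stretches the linear region, which is the kind of adjustment the abstract describes: placing the informative part of the input within the linear region and away from the saturating regions.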