2013 Special Issue: Nonlinear spectro-temporal features based on a cochlear model for automatic speech recognition in a noisy situation

Authors:
Yong-Sun Choi;Soo-Young Lee
Affiliations:
-;-
Venue:
Neural Networks
Year:
2013

Abstract

A nonlinear speech feature extraction algorithm was developed by modeling human cochlear functions, and was demonstrated as a noise-robust front-end for speech recognition systems. The algorithm was based on a model of the Organ of Corti in the human cochlea, including the basilar membrane (BM), outer hair cells (OHCs), and inner hair cells (IHCs). Frequency-dependent nonlinear compression and amplification by the OHCs were modeled by lateral inhibition to enhance spectral contrasts. In particular, the compression coefficients were made frequency dependent on the basis of psychoacoustic evidence. Spectral subtraction and temporal adaptation were applied in the time-frame domain. With long-term and short-term adaptation characteristics, these operations remove stationary or slowly varying components and amplify temporal changes such as onsets and offsets. The proposed features were evaluated on a noisy speech database and outperformed baseline methods such as mel-frequency cepstral coefficients (MFCCs) and RASTA-PLP in unknown noisy conditions.
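
The processing stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no equations, so the function names, the power-law form of the compression, the three-channel inhibition kernel, the first-order recursive average used for adaptation, and all parameter values are assumptions made for illustration. The input is assumed to be a non-negative array of filter-bank energies with shape (channels, frames).

```python
import numpy as np

def compress(energy, exponents):
    """Power-law compression with one (hypothetical) exponent per channel."""
    return energy ** exponents[:, None]

def lateral_inhibition(x, strength=0.5):
    """Subtract a fraction of the mean of the two neighbouring channels,
    then half-wave rectify. Edge channels reuse their own value as the
    missing neighbour. Kernel and strength are illustrative assumptions."""
    padded = np.pad(x, ((1, 1), (0, 0)), mode="edge")
    neighbours = 0.5 * (padded[:-2] + padded[2:])
    return np.maximum(x - strength * neighbours, 0.0)

def temporal_adaptation(x, alpha=0.98):
    """Subtract a slowly updated per-channel level from each frame.
    Stationary components cancel; onsets and offsets stand out.
    The recursive average and alpha are illustrative assumptions."""
    level = x[:, 0].copy()
    out = np.empty_like(x, dtype=float)
    for t in range(x.shape[1]):
        out[:, t] = x[:, t] - level
        level = alpha * level + (1.0 - alpha) * x[:, t]
    return out

def cochlear_like_features(energy, exponents):
    """Chain the three stages in the order the abstract mentions them."""
    return temporal_adaptation(lateral_inhibition(compress(energy, exponents)))
```

In this sketch a constant input in every channel yields an all-zero output from `temporal_adaptation`, while a sudden level change produces a large value at the onset frame that then decays, which matches the qualitative behaviour the abstract describes. A smaller `alpha` would adapt faster (short-term behaviour) and a larger one more slowly (long-term behaviour).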