Humans are quite adept at communicating in the presence of noise. However, most speech processing systems, such as automatic speech and speaker recognition systems, suffer a significant drop in performance when speech signals are corrupted by unseen background distortions. The proposed work explores the use of a biologically motivated multi-resolution spectral analysis for speech representation. This approach focuses on the information-rich spectral attributes of speech and, through careful choice of model parameters, presents an intricate yet computationally efficient analysis of the speech signal. Further, the approach takes advantage of an information-theoretic analysis of the message-dominant and speaker-dominant regions in the speech signal, and defines feature representations to address two diverse tasks: speech recognition and speaker recognition. The proposed analysis outperforms standard Mel-frequency cepstral coefficients (MFCC) and their enhanced variants (via mean subtraction, variance normalization, and time-sequence filtering), and yields significant improvements over a state-of-the-art noise-robust feature scheme on both speech and speaker recognition tasks.
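For readers unfamiliar with the baseline enhancements named above (mean subtraction, variance normalization, and time-sequence filtering), the following is a minimal sketch of how such post-processing is commonly applied to a matrix of cepstral features. It is not the paper's implementation: the function name `normalize_features`, the parameter `smooth_half_width`, and the use of a plain moving average as the temporal filter are all assumptions made for illustration.

```python
import numpy as np


def normalize_features(feats, smooth_half_width=2):
    """Per-utterance mean/variance normalization followed by temporal smoothing.

    feats: array of shape (num_frames, num_coeffs), e.g. MFCC frames.
    smooth_half_width: hypothetical parameter; frames on each side used by
        the moving-average filter (0 disables smoothing).
    """
    feats = np.asarray(feats, dtype=float)

    # Mean subtraction and variance normalization, per coefficient.
    mean = feats.mean(axis=0)
    std = feats.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # avoid dividing by zero on constant columns
    normed = (feats - mean) / std

    if smooth_half_width <= 0:
        return normed

    # Time-sequence filtering: a simple moving average along the frame axis
    # (an assumed stand-in for whichever filter a given system uses).
    width = 2 * smooth_half_width + 1
    kernel = np.ones(width) / width
    padded = np.pad(
        normed,
        ((smooth_half_width, smooth_half_width), (0, 0)),
        mode="edge",
    )
    smoothed = np.stack(
        [np.convolve(padded[:, k], kernel, mode="valid")
         for k in range(normed.shape[1])],
        axis=1,
    )
    return smoothed
```

Normalizing per utterance removes stationary channel offsets and scale differences, and the smoothing step suppresses rapid frame-to-frame fluctuations that noise tends to introduce.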