IEEE Transactions on Audio, Speech, and Language Processing
The maximum a posteriori (MAP) criterion is widely used for feature compensation (FC) and acoustic model adaptation (MA) to reduce the mismatch between training and testing data. MAP-based FC and MA require prior densities for the mapping function parameters, and the design of these prior densities plays an important role in obtaining satisfactory performance. In this paper, we propose an environment structuring framework that provides suitable prior densities for MAP-based FC and MA in robust speech recognition. The framework is a two-stage hierarchical tree built through environment clustering and partitioning processes. It characterizes local information about complex speaker and speaking-environment acoustic conditions, and this local information is used to specify the hyper-parameters of the prior densities, which MAP-based FC and MA then use to handle the mismatch. We evaluated the proposed framework on Aurora-2, a connected digit recognition task, and Aurora-4, a large vocabulary continuous speech recognition (LVCSR) task. On both tasks, the experimental results showed that the environment structuring framework provided suitable prior densities that enhanced the performance of MAP-based FC and MA.
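To make the role of the prior concrete, the following is a minimal sketch, not the method of the paper. It assumes a scalar bias-only mapping with a Gaussian likelihood of known variance and a conjugate Gaussian prior, and it picks the prior from a flat list of clusters rather than a two-stage tree. The names `map_bias`, `select_prior`, and `cluster_priors`, and all numeric values, are hypothetical.

```python
def map_bias(diffs, prior_mean, prior_var, noise_var):
    """MAP estimate of a scalar bias b.

    Assumes diffs[t] ~ N(b, noise_var) and a prior b ~ N(prior_mean, prior_var).
    The estimate interpolates between the sample mean and the prior mean.
    """
    n = len(diffs)
    if n == 0:
        # No adaptation data: fall back to the prior mean.
        return prior_mean
    sample_mean = sum(diffs) / n
    # Weight on the data grows with more samples or a broader prior.
    weight = n * prior_var / (n * prior_var + noise_var)
    return weight * sample_mean + (1.0 - weight) * prior_mean


def select_prior(diffs, cluster_priors):
    """Pick the cluster whose prior mean is closest to the observed mean."""
    sample_mean = sum(diffs) / len(diffs)
    return min(cluster_priors, key=lambda p: abs(p["mean"] - sample_mean))


# Hypothetical hyper-parameters for two environment clusters.
cluster_priors = [
    {"name": "quiet", "mean": 0.0, "var": 0.25},
    {"name": "noisy", "mean": 2.0, "var": 0.25},
]

# Hypothetical differences between observed and reference features.
diffs = [1.0, 2.0]

prior = select_prior(diffs, cluster_priors)
b_map = map_bias(diffs, prior["mean"], prior["var"], noise_var=1.0)
print(prior["name"], round(b_map, 4))
```

With only two observations, the estimate stays closer to the selected cluster's prior mean than to the sample mean, which illustrates why well-chosen hyper-parameters matter when adaptation data are scarce.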