This paper describes a complete system for Blind Source Extraction (BSE). The goal is to extract a target source so that spoken commands uttered in reverberant, noisy environments and captured by a microphone array can be recognized. The BSE architecture has four stages: (a) time difference of arrival (TDOA) estimation, (b) identification of the mixing system for the target source, (c) on-line semi-blind source separation, and (d) source extraction. Combining these stages allows the target signal to be estimated with limited distortion. Although a generalization of the BSE framework is described, the proposed system is evaluated here on the data provided for the 2011 PASCAL CHiME challenge, i.e. binaural recordings made in a real-world domestic environment. The CHiME mixtures are processed with the BSE system, and the recovered target signal is fed to a recognizer that uses noise-robust features based on Gammatone Frequency Cepstral Coefficients. Acoustic model adaptation is also applied to further reduce the mismatch between training and testing data and to improve overall performance. A detailed comparison of different models and algorithmic settings is reported, showing that the approach is promising and that the resulting system significantly reduces the error rate.
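To make stage (a) concrete, the sketch below estimates the TDOA between two microphone channels. It uses GCC-PHAT (generalized cross-correlation with phase transform), a widely used textbook estimator that is swapped in here for illustration; the abstract does not say which TDOA estimator the system uses, so this should not be read as the authors' method. The function name `gcc_phat_tdoa`, its parameters, and the synthetic test signal are all hypothetical.

```python
import numpy as np


def gcc_phat_tdoa(x_left, x_right, fs, max_delay_s=None):
    """Estimate the delay of x_right relative to x_left, in seconds.

    Illustrative GCC-PHAT estimator (an assumption, not the paper's method).
    A positive result means x_right lags x_left.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x_left) + len(x_right)
    spec_left = np.fft.rfft(x_left, n=n)
    spec_right = np.fft.rfft(x_right, n=n)

    # Cross-power spectrum, whitened by its magnitude (the PHAT weighting).
    cross = spec_right * np.conj(spec_left)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Optionally restrict the search to physically plausible delays.
    max_shift = n // 2
    if max_delay_s is not None:
        max_shift = min(int(round(max_delay_s * fs)), max_shift)

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[n - max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


if __name__ == "__main__":
    # Synthetic check: the right channel is the left one delayed by 5 samples.
    fs = 16000
    rng = np.random.default_rng(0)
    left = rng.standard_normal(4096)
    right = np.concatenate((np.zeros(5), left[:-5]))
    print(gcc_phat_tdoa(left, right, fs, max_delay_s=0.001) * fs, "samples")
```

The `max_delay_s` bound is a design convenience: for a two-microphone array the delay cannot exceed the microphone spacing divided by the speed of sound, so limiting the search range avoids spurious peaks caused by reverberation.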