Independent component analysis and time-frequency masking for speech recognition in multitalker conditions

Authors:
Dorothea Kolossa;Ramon Fernandez Astudillo;Eugen Hoffmann;Reinhold Orglmeister
Affiliations:
Electronics and Medical Signal Processing Group, Berlin, Germany;Electronics and Medical Signal Processing Group, Berlin, Germany;Electronics and Medical Signal Processing Group, Berlin, Germany;Electronics and Medical Signal Processing Group, Berlin, Germany
Venue:
EURASIP Journal on Audio, Speech, and Music Processing
Year:
2010

Abstract

When a number of speakers are simultaneously active, for example in meetings or noisy public places, the sources of interest need to be separated from interfering speakers and from each other in order to be robustly recognized. Independent component analysis (ICA) has proven a valuable tool for this purpose. However, ICA outputs can still contain strong residual components of the interfering speakers whenever noise or reverberation is high. In such cases, nonlinear postprocessing can be applied to the ICA outputs, for the purpose of reducing remaining interferences. In order to improve robustness to the artefacts and loss of information caused by this process, recognition can be greatly enhanced by considering the processed speech feature vector as a random variable with time-varying uncertainty, rather than as deterministic. The aim of this paper is to show the potential to improve recognition of multiple overlapping speech signals through nonlinear postprocessing together with uncertainty-based decoding techniques.
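The two ideas named in the abstract, nonlinear postprocessing of ICA outputs and decoding with time-varying feature uncertainty, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simple binary time-frequency mask that keeps each cell only for the output with the highest power, and one common form of uncertainty decoding in which the feature variance is added to the variance of a diagonal Gaussian acoustic model. The function names, the `margin_db` parameter, and the array layout are hypothetical choices made for illustration.

```python
import numpy as np


def binary_tf_mask(ica_outputs, margin_db=0.0):
    """Apply a binary time-frequency mask to separated spectrograms.

    ica_outputs: complex array of shape (n_sources, n_freq, n_frames),
    assumed to hold the STFTs of the ICA outputs (at least two sources).
    A cell is kept for source i only if its power exceeds the strongest
    competing output by more than margin_db (hypothetical parameter).
    """
    power_db = 10.0 * np.log10(np.abs(ica_outputs) ** 2 + 1e-12)
    masks = np.empty(ica_outputs.shape, dtype=bool)
    for i in range(ica_outputs.shape[0]):
        # Strongest competing output in every time-frequency cell.
        strongest_other = np.delete(power_db, i, axis=0).max(axis=0)
        masks[i] = power_db[i] > strongest_other + margin_db
    return masks * ica_outputs, masks


def uncertain_log_likelihood(feat_mean, feat_var, state_mean, state_var):
    """Log-likelihood of an uncertain feature under a diagonal Gaussian.

    The feature is treated as a Gaussian random variable with mean
    feat_mean and variance feat_var. Marginalizing over it adds feat_var
    to the state variance; feat_var = 0 gives the ordinary likelihood.
    """
    total_var = state_var + feat_var
    return -0.5 * np.sum(
        np.log(2.0 * np.pi * total_var)
        + (feat_mean - state_mean) ** 2 / total_var
    )
```

In such a scheme, cells removed by the mask would be assigned a large feature variance, so that the decoder relies less on the components distorted by postprocessing. How the uncertainty is estimated and propagated to the recognition features is the subject of the paper itself and is not covered by this sketch.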