Beamforming initialization and data prewhitening in natural gradient convolutive blind source separation of speech mixtures

Authors:
Malay Gupta; Scott C. Douglas
Affiliations:
Department of Electrical Engineering, Southern Methodist University, Dallas, Texas (both authors)
Venue:
ICA'07 Proceedings of the 7th international conference on Independent component analysis and signal separation
Year:
2007

Abstract

Successful speech enhancement by convolutive blind source separation (BSS) requires careful design of every aspect of the chosen separation method. The conventional initialization strategy in both time- and frequency-domain BSS uses a diagonal center-spike FIR filter matrix and no data preprocessing; however, this strategy may not be the best choice for a given separation algorithm. In this paper, we experimentally evaluate two approaches for potentially improving the performance of time-domain and frequency-domain natural gradient speech separation algorithms: prewhitening of the signal mixtures, and delay-and-sum beamforming initialization of the separation system. The goal is to determine which of the two classes of algorithms benefits most from each approach. Our results indicate that frequency-domain natural gradient BSS methods generally need geometric information about the system to obtain any reasonable separation quality. For time-domain natural gradient separation algorithms, either beamforming initialization or prewhitening improves separation performance, particularly for larger-scale problems involving three or more sources and sensors.
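
To make the three ingredients named in the abstract concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a uniform-style linear microphone array, far-field sources, integer-sample steering delays, and purely spatial (instantaneous) whitening; the paper itself may use a different prewhitening scheme. All function names, the `W[output, input, tap]` layout, and the parameter values are hypothetical choices made for this example.

```python
import numpy as np

def center_spike_init(n_ch, n_taps):
    """Conventional initialization: diagonal FIR matrix with a unit spike
    at the center tap. Layout (an assumption): W[output, input, tap]."""
    W = np.zeros((n_ch, n_ch, n_taps))
    idx = np.arange(n_ch)
    W[idx, idx, n_taps // 2] = 1.0
    return W

def delay_and_sum_init(n_taps, mic_pos, doas_deg, fs, c=343.0):
    """Delay-and-sum initialization: output i is steered toward doas_deg[i].
    Assumes a linear array (positions in meters), far-field plane waves,
    angles measured from broadside, and delays rounded to whole samples."""
    mic_pos = np.asarray(mic_pos, dtype=float)
    n_ch = mic_pos.size
    center = n_taps // 2
    W = np.zeros((len(doas_deg), n_ch, n_taps))
    for i, theta in enumerate(np.deg2rad(doas_deg)):
        # Assumed arrival delay (in samples) at each microphone.
        tau = mic_pos * np.sin(theta) / c * fs
        # Compensate so all channels line up at the center tap.
        taps = center - np.rint(tau).astype(int)
        if taps.min() < 0 or taps.max() >= n_taps:
            raise ValueError("n_taps too small for the required steering delays")
        W[i, np.arange(n_ch), taps] = 1.0 / n_ch
    return W

def prewhiten(x):
    """Spatial prewhitening of mixtures x with shape (n_ch, n_samples).
    Returns the whitened signals and the whitening matrix V."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    d, E = np.linalg.eigh(cov)          # cov = E diag(d) E^T
    V = E @ np.diag(1.0 / np.sqrt(d)) @ E.T
    return V @ x, V
```

In a separation pipeline, one would either pass `prewhiten(x)[0]` to the adaptive algorithm with a center-spike start, or keep the raw mixtures and start from `delay_and_sum_init(...)`; the abstract's experiments compare options of this kind, though the exact configurations are not given here.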