A New Framework for Underdetermined Speech Extraction Using Mixture of Beamformers

Authors:
M. A. Dmour;M. Davies
Affiliations:
Institute for Digital Communications (IDCOM), University of Edinburgh, Edinburgh, UK;-
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2011

Abstract

This paper describes a frequency-domain nonlinear mixture of beamformers that can extract a speech source from a known direction when there are fewer microphones than sources (the underdetermined case). Our approach models the data in each frequency bin with Gaussian mixture distributions, which can be learned using the expectation-maximization algorithm. Model learning uses only the observed mixture signals, and no prior training is required. Nonlinear beamformers are then developed based on this model. The proposed estimators are nonlinear weighted sums of linear minimum mean square error or minimum variance distortionless response beamformers. The resulting nonlinear beamformers do not need to know or estimate the number of sources, and can be applied to microphone arrays with two or more microphones. We test and evaluate the described methods on underdetermined speech mixtures.
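
The idea in the abstract can be illustrated with a minimal sketch for a single frequency bin. It is not the paper's exact model. It assumes a zero-mean circular complex Gaussian mixture over the microphone snapshots, fits it by EM on the observed mixture only, builds one MVDR beamformer per mixture component, and combines their outputs using the posterior component probabilities. All names here (`fit_gmm_em`, `mvdr_weights`, `mixture_of_mvdr`, the component count `K`, the regularizer `eps`, the steering vector `d`) are hypothetical and chosen for illustration.

```python
import numpy as np

def log_complex_gauss(X, R):
    """Log-density of a zero-mean circular complex Gaussian.
    X: (T, M) snapshots, R: (M, M) Hermitian covariance."""
    M = X.shape[1]
    Rinv = np.linalg.inv(R)
    _, logdet = np.linalg.slogdet(R)
    # Quadratic form x^H R^{-1} x for every snapshot
    quad = np.real(np.einsum("tm,mn,tn->t", X.conj(), Rinv, X))
    return -M * np.log(np.pi) - logdet - quad

def fit_gmm_em(X, K=3, n_iter=30, eps=1e-6, seed=0):
    """EM for a zero-mean complex Gaussian mixture (assumed model)."""
    rng = np.random.default_rng(seed)
    T, M = X.shape
    gamma = rng.dirichlet(np.ones(K), size=T)  # random soft assignments, (T, K)
    for _ in range(n_iter):
        # M-step: component weights and covariances
        Nk = gamma.sum(axis=0) + 1e-12
        pi = Nk / Nk.sum()
        R = np.einsum("tk,tm,tn->kmn", gamma, X, X.conj()) / Nk[:, None, None]
        R = R + eps * np.eye(M)  # diagonal loading keeps R invertible
        # E-step: posterior probability of each component per snapshot
        logp = np.stack(
            [np.log(pi[k]) + log_complex_gauss(X, R[k]) for k in range(K)],
            axis=1,
        )
        logp -= logp.max(axis=1, keepdims=True)  # numerical stability
        gamma = np.exp(logp)
        gamma /= gamma.sum(axis=1, keepdims=True)
    return pi, R, gamma

def mvdr_weights(R, d):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d), so that w^H d = 1."""
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / (d.conj() @ Rinv_d)

def mixture_of_mvdr(X, d, K=3, n_iter=30):
    """Posterior-weighted sum of per-component MVDR outputs."""
    pi, R, gamma = fit_gmm_em(X, K=K, n_iter=n_iter)
    W = np.stack([mvdr_weights(R[k], d) for k in range(K)])  # (K, M)
    Y = X @ W.conj().T  # Y[t, k] = w_k^H x_t
    return np.sum(gamma * Y, axis=1), W, gamma

if __name__ == "__main__":
    # Synthetic stand-in for one STFT bin: 2 microphones, 500 frames
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 2)) + 1j * rng.standard_normal((500, 2))
    d = np.array([1.0, np.exp(-1j * 0.3)])  # assumed steering vector
    s_hat, W, gamma = mixture_of_mvdr(X, d)
    print(s_hat.shape, np.abs(W.conj() @ d))
```

Because the weights depend on each snapshot through the posteriors, the overall estimator is nonlinear in the data even though every component beamformer is linear. In a full system this would be repeated for each frequency bin of the STFT. Replacing `mvdr_weights` with a linear MMSE solution would give the other estimator family mentioned in the abstract, but that needs a source variance model that this sketch does not include.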