Structured Sparsity Models for Reverberant Speech Separation

Authors:
Afsaneh Asaei;Mohammad Golbabaee;Herve Bourlard;Volkan Cevher
Affiliations:
Idiap Research Institute;Applied Mathematics Research Center (CEREMADE), Université Paris-Dauphine, France;Idiap Research Institute;École Polytechnique Fédérale de Lausanne, Switzerland
Venue:
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Year:
2014

Abstract

We tackle the speech separation problem by modeling the acoustics of reverberant chambers. Our approach exploits structured sparsity models to perform speech recovery and room acoustic modeling from recordings of concurrent unknown sources. The speakers are assumed to lie on a two-dimensional plane, and the multipath channel is characterized using the image model. We propose an algorithm for room geometry estimation that localizes the early images of the speakers by sparse approximation of the spatial spectrum of the virtual sources in a free-space model. The images are then clustered by exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further resolve the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization that exploits a joint sparsity model built on the spatio-spectral sparsity of the concurrent speech representation. The acoustic parameters are then used to separate the individual speech signals through either structured sparse recovery or inverse filtering of the acoustic channels. Experiments conducted on real recordings of spatially stationary sources demonstrate the effectiveness of the proposed approach for speech separation and recognition.
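
The image model and its link to room geometry can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a rectangular two-dimensional room with axis-aligned walls at x = 0, x = Lx, y = 0 and y = Ly, first-order reflections only, and a sound speed of 343 m/s. All function names and the numeric positions are hypothetical, chosen only for illustration. Each wall mirrors the source into a virtual "image" source, and once the images on the far walls are located, the wall positions follow directly from the midpoint between the source and its image.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def first_order_images(source, room):
    """Mirror a 2-D source across the four walls of a rectangular room.

    Assumes walls at x=0, x=Lx, y=0, y=Ly (a simplification for this sketch).
    """
    sx, sy = source
    lx, ly = room
    return {
        "x=0": (-sx, sy),
        "x=Lx": (2.0 * lx - sx, sy),
        "y=0": (sx, -sy),
        "y=Ly": (sx, 2.0 * ly - sy),
    }


def delay_seconds(point, mic):
    """Free-space propagation delay from a (real or virtual) source to a microphone."""
    return math.hypot(point[0] - mic[0], point[1] - mic[1]) / SPEED_OF_SOUND


def room_from_images(source, images):
    """Recover (Lx, Ly): each wall lies midway between the source and its image."""
    sx, sy = source
    lx = (images["x=Lx"][0] + sx) / 2.0
    ly = (images["y=Ly"][1] + sy) / 2.0
    return (lx, ly)


# Hypothetical layout: positions and room size in meters.
source = (1.0, 2.0)
room = (4.0, 5.0)
mic = (3.0, 1.0)

images = first_order_images(source, room)
direct_delay = delay_seconds(source, mic)
image_delays = {wall: delay_seconds(pos, mic) for wall, pos in images.items()}
estimated_room = room_from_images(source, images)
```

In this sketch every image delay exceeds the direct-path delay, so the images account for the early echoes of the room impulse response, and `estimated_room` reproduces the room dimensions that generated the images. In the setting of the abstract the image positions are not known in advance; they would have to be estimated from the recordings before such a mapping could be applied.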