A robust method to count and locate audio sources in a stereophonic linear instantaneous mixture

Authors:
Simon Arberet;Rémi Gribonval;Frédéric Bimbot
Affiliations:
IRISA, France;IRISA, France;IRISA, France
Venue:
ICA'06 Proceedings of the 6th international conference on Independent Component Analysis and Blind Signal Separation
Year:
2006

Citing (2):
The enhanced LBG algorithm (Neural Networks)
Blind separation of speech mixtures via time-frequency masking (IEEE Transactions on Signal Processing)

Cited by (10):
K-hyperline clustering learning for sparse component analysis (Signal Processing)
Blind Non-stationnary Sources Separation by Sparsity in a Linear Instantaneous Mixture (ICA '09 Proceedings of the 8th International Conference on Independent Component Analysis and Signal Separation)
An ICA-Based Method for Blind Source Separation in Sparse Domains (ICA '09 Proceedings of the 8th International Conference on Independent Component Analysis and Signal Separation)
A Uniform Framework for Ad-Hoc Indexes to Answer Reachability Queries on Large Graphs (DASFAA '09 Proceedings of the 14th International Conference on Database Systems for Advanced Applications)
Blind Spectral-GMM Estimation for Underdetermined Instantaneous Audio Source Separation (ICA '09 Proceedings of the 8th International Conference on Independent Component Analysis and Signal Separation)
Underdetermined Instantaneous Audio Source Separation via Local Gaussian Modeling (ICA '09 Proceedings of the 8th International Conference on Independent Component Analysis and Signal Separation)
Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation (IEEE Transactions on Audio, Speech, and Language Processing)
First stereo audio source separation evaluation campaign: data, algorithms and results (ICA'07 Proceedings of the 7th international conference on Independent component analysis and signal separation)
A robust method to count and locate audio sources in a multichannel underdetermined mixture (IEEE Transactions on Signal Processing)
Approximated Cramér-Rao bound for estimating the mixing matrix in the two-sensor noisy Sparse Component Analysis (SCA) (Digital Signal Processing)

Abstract

We propose a robust method to estimate the number of audio sources and the mixing matrix in a linear instantaneous mixture, even with more sources than sensors. Our method is based on a multiscale Short Time Fourier Transform (STFT), and relies on the assumption that in the neighborhood of some (unknown) scales and time-frequency points, only one source contributes to the mixture. Such time-frequency regions provide local estimates of the corresponding columns of the mixing matrix. Our main contribution is a new clustering algorithm called DEMIX to estimate the number of sources and the mixing matrix based on such local estimates. In contrast to DUET or other similar sparsity-based algorithms, which rely on a global scatter plot, our algorithm exploits a local confidence measure to weight the influence of each time-frequency point in the estimated matrix. Inspired by the work of Deville, the confidence measure relies on the time-frequency local persistence of the activity/inactivity of each source. Experiments are provided with stereophonic mixtures and show the improved performance of DEMIX compared to K-means or ELBG clustering algorithms.
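
The idea in the abstract (local direction estimates from time-frequency regions where a single source dominates, each weighted by a confidence measure, then clustered to count the sources) can be illustrated with a small sketch. This is not the DEMIX algorithm itself: it is a simplified stand-in that assumes synthetic time-frequency coefficients instead of a real multiscale STFT, uses the eigenvalue ratio of a local 2x2 covariance as a hypothetical confidence measure, and replaces the paper's clustering with a plain greedy grouping. All function names, thresholds, and the simulated activity pattern are assumptions made for illustration.

```python
import numpy as np

def local_direction_estimates(X, width=10):
    """X: complex array (2, F, T) of stereo time-frequency coefficients.
    For each 1 x width region, return the principal direction angle in
    [0, pi) and a confidence (ratio of the two covariance eigenvalues)."""
    _, F, T = X.shape
    angles, confs = [], []
    for f in range(F):
        for t0 in range(0, T - width + 1, width):
            region = X[:, f, t0:t0 + width]               # shape (2, width)
            R = np.real(region @ region.conj().T) / width  # local covariance
            lam, vec = np.linalg.eigh(R)                   # eigenvalues ascending
            u = vec[:, -1]                                 # principal direction
            angles.append(np.arctan2(u[1], u[0]) % np.pi)  # sign-invariant angle
            confs.append(lam[1] / max(lam[0], 1e-12))
    return np.array(angles), np.array(confs)

def greedy_direction_clustering(angles, confs, min_conf=100.0, tol=0.05,
                                min_support=20):
    """Simplified clustering (an assumption, not the paper's procedure):
    seed a cluster at the most confident remaining estimate, absorb all
    estimates within `tol` radians, and keep the cluster only if it has
    at least `min_support` members."""
    keep = confs >= min_conf
    a, c = angles[keep], confs[keep]
    centers = []
    while a.size:
        seed = a[np.argmax(c)]
        # signed angular deviation from the seed, wrapped to [-pi/2, pi/2)
        dev = ((a - seed + np.pi / 2) % np.pi) - np.pi / 2
        members = np.abs(dev) < tol
        if members.sum() >= min_support:
            offset = np.average(dev[members], weights=c[members])
            centers.append((seed + offset) % np.pi)
        a, c = a[~members], c[~members]
    return np.sort(np.array(centers))

# Synthetic stereo mixture of 3 sources (more sources than sensors).
rng = np.random.default_rng(0)
true_angles = np.array([0.3, 0.8, 1.3])                    # radians
A = np.vstack([np.cos(true_angles), np.sin(true_angles)])  # 2 x 3 mixing matrix
F, T, width = 64, 200, 10

# Each source is switched on or off per region, mimicking local persistence
# of activity; regions are aligned with the analysis grid for simplicity.
active = rng.random((3, F, T // width)) < 0.3
active = np.repeat(active, width, axis=2)
S = (rng.standard_normal((3, F, T))
     + 1j * rng.standard_normal((3, F, T))) * active
noise = 1e-3 * (rng.standard_normal((2, F, T))
                + 1j * rng.standard_normal((2, F, T)))
X = np.einsum('ks,sft->kft', A, S) + noise

angles, confs = local_direction_estimates(X, width)
centers = greedy_direction_clustering(angles, confs)
print("estimated number of sources:", len(centers))
print("estimated directions (rad):", centers)
```

In this sketch, regions where exactly one source is active yield a nearly rank-one covariance and therefore a very high confidence, while regions with several active sources or only noise fall below `min_conf` and do not influence the estimate. Each returned angle corresponds to one column `[cos(theta), sin(theta)]` of the estimated mixing matrix, and the number of returned angles is the estimated source count.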