We consider inference in a general data-driven, object-based model of multichannel audio data, assumed to be generated as a possibly underdetermined convolutive mixture of source signals. We work in the short-time Fourier transform (STFT) domain, where convolution is routinely approximated as linear instantaneous mixing in each frequency band. Each source STFT is given a model inspired by nonnegative matrix factorization (NMF) with the Itakura-Saito divergence, which underlies a statistical model of superimposed Gaussian components. We address estimation of the mixing and source parameters using two methods. The first consists of maximizing the exact joint likelihood of the multichannel data using an expectation-maximization (EM) algorithm. The second consists of maximizing the sum of the individual likelihoods of all channels using a multiplicative update algorithm inspired by NMF methodology. Our decomposition algorithms are applied to stereo audio source separation in various settings, covering blind and supervised separation, music and speech sources, synthetic instantaneous and convolutive mixtures, as well as professionally produced music recordings. Our EM method produces results competitive with the state of the art, as illustrated on two tasks from the international Signal Separation Evaluation Campaign (SiSEC 2008).
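To make the NMF building block concrete, the following is a minimal single-channel sketch of multiplicative updates for NMF under the Itakura-Saito divergence, applied to a nonnegative power spectrogram `V` (frequency bins by time frames). It is not the multichannel EM or multiplicative update algorithm described above: it has no mixing parameters, and the function names `is_divergence` and `is_nmf`, the random initialization, and the small constant `eps` are assumptions made for illustration only.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence D_IS(V | V_hat), summed over all entries."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf(V, n_components, n_iter=100, eps=1e-12, seed=0):
    """Approximate V (F x N, strictly positive) by W @ H with W, H >= 0.

    Uses the common heuristic multiplicative updates for the
    Itakura-Saito divergence. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    # Hypothetical initialization: positive random factors.
    W = rng.random((F, n_components)) + eps
    H = rng.random((n_components, N)) + eps
    for _ in range(n_iter):
        # Update the activations H with W held fixed.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
        # Update the spectral templates W with H held fixed.
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
    return W, H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic rank-2 "power spectrogram", kept strictly positive.
    V = rng.random((6, 2)) @ rng.random((2, 20)) + 1e-3
    W, H = is_nmf(V, n_components=2, n_iter=200)
    print("IS divergence after fitting:", is_divergence(V, W @ H))
```

Because each update multiplies the current factor by a ratio of nonnegative terms, `W` and `H` stay nonnegative without any explicit projection. In the Gaussian interpretation mentioned above, each column of `W` plays the role of a variance spectrum for one component and each row of `H` its time-varying gain.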