Multichannel nonnegative matrix factorization in convolutive mixtures. With application to blind audio source separation

  • Authors:
  • Alexey Ozerov; Cedric Fevotte

  • Affiliations:
  • Institut TELECOM, TELECOM ParisTech, CNRS LTCI, 37-39, rue Dareau, 75014, France; CNRS LTCI, TELECOM ParisTech, 37-39, rue Dareau, 75014, France

  • Venue:
  • ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
  • Year:
  • 2009

Abstract

We consider inference in a general data-driven, object-based model of multichannel audio data, assumed to be generated as a possibly underdetermined convolutive mixture of source signals. Each source is given a model inspired by nonnegative matrix factorization (NMF) with the Itakura-Saito divergence, which underlies a statistical model of superimposed Gaussian components. We address estimation of the mixing and source parameters using two methods. The first maximizes the exact joint likelihood of the multichannel data using an expectation-maximization algorithm. The second maximizes the sum of the individual likelihoods of all channels using a multiplicative update algorithm inspired by NMF methodology. Our decomposition algorithms were applied to stereo music and assessed in terms of blind source separation performance.
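
For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of single-channel NMF with the Itakura-Saito divergence, fitted with the standard multiplicative updates. It is not the multichannel EM or multiplicative update algorithm of the paper: there is no mixing model and only one channel. The function names `is_divergence` and `is_nmf`, the parameters `n_components`, `n_iter`, `eps`, and `seed`, and the random initialization are all assumptions made for illustration.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence between V and V_hat, summed over all entries."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf(V, n_components, n_iter=100, eps=1e-12, seed=0):
    """Factor a strictly positive matrix V (F x N) as W @ H under the IS divergence.

    Illustrative single-channel sketch; all names and defaults are hypothetical.
    Returns W (F x K), H (K x N), and the cost at initialization and after
    each iteration.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    # Random strictly positive initialization (an assumption, not from the paper).
    W = rng.random((F, n_components)) + eps
    H = rng.random((n_components, N)) + eps
    costs = [is_divergence(V, W @ H + eps)]
    for _ in range(n_iter):
        # Multiplicative update of the spectral templates W.
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T + eps)
        # Multiplicative update of the activations H, using the refreshed model.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + eps)
        costs.append(is_divergence(V, W @ H + eps))
    return W, H, costs


if __name__ == "__main__":
    # Synthetic "power spectrogram": low-rank variances times exponential noise.
    rng = np.random.default_rng(1)
    V = (rng.random((20, 3)) @ rng.random((3, 40)) + 0.1) * rng.exponential(1.0, (20, 40)) + 1e-9
    W, H, costs = is_nmf(V, n_components=3)
    print(f"IS divergence: {costs[0]:.2f} -> {costs[-1]:.2f}")
```

In an audio setting, `V` would typically be the power spectrogram of one channel, the columns of `W` spectral templates, and the rows of `H` their time-varying gains. The multiplicative form keeps both factors nonnegative without an explicit projection step.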