Extension of Sparse, Adaptive Signal Decompositions to Semi-blind Audio Source Separation

  • Authors:
  • Andrew Nesbit; Emmanuel Vincent; Mark D. Plumbley

  • Affiliations:
  • School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom E1 4NS; METISS Group, IRISA-INRIA, Rennes Cedex, France 35042; School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom E1 4NS

  • Venue:
  • ICA '09 Proceedings of the 8th International Conference on Independent Component Analysis and Signal Separation
  • Year:
  • 2009

Abstract

We apply sparse, fast and flexible adaptive lapped orthogonal transforms to underdetermined audio source separation using the time-frequency masking framework. This normally requires the sources to overlap as little as possible in the time-frequency plane. In this work, we apply our adaptive transform schemes to the semi-blind case, in which the mixing system is already known, but the sources are unknown. By assuming that exactly two sources are active at each time-frequency index, we determine both the adaptive transforms and the estimated source coefficients using ℓ1 norm minimisation. We show average performance of 12-13 dB SDR on speech and music mixtures, and show that the adaptive transform scheme offers improvements in the order of several tenths of a dB over transforms with constant block length. Comparison with previously studied upper bounds suggests that the potential for future improvements is significant.
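The coefficient-estimation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that are not stated in the abstract: a two-channel instantaneous mixture, real-valued transform coefficients, and a known mixing matrix `A` of shape (2, J). The function name `l1_two_source_estimate` and all variable names are hypothetical. For each time-frequency coefficient, the sketch tries every pair of sources, inverts the corresponding 2×2 submatrix of `A`, and keeps the pair whose solution has the smallest ℓ1 norm. The adaptive choice of transform (block lengths) is not covered here.

```python
import itertools
import numpy as np


def l1_two_source_estimate(A, X):
    """Estimate source coefficients with at most two active sources per index.

    A : (2, J) array, the known mixing matrix (assumed instantaneous, real).
    X : (2, N) array, mixture coefficients at N time-frequency indices.
    Returns S : (J, N) array with at most two non-zero entries per column,
    chosen so that A @ S == X and the l1 norm of each column is minimal
    among all two-source explanations.
    """
    n_ch, n_src = A.shape
    n_coef = X.shape[1]
    best_cost = np.full(n_coef, np.inf)
    S = np.zeros((n_src, n_coef), dtype=float)

    for pair in itertools.combinations(range(n_src), n_ch):
        rows = list(pair)
        sub = A[:, rows]                      # 2x2 submatrix for this pair
        if abs(np.linalg.det(sub)) < 1e-12:   # skip (near-)collinear directions
            continue
        cand = np.linalg.solve(sub, X)        # candidate coefficients, (2, N)
        cost = np.abs(cand).sum(axis=0)       # l1 norm per index
        idx = np.flatnonzero(cost < best_cost)
        best_cost[idx] = cost[idx]
        S[:, idx] = 0.0                       # clear the previous best pair
        S[np.ix_(rows, idx)] = cand[:, idx]   # store the new best pair
    return S
```

Under these assumptions, calling `l1_two_source_estimate(A, X)` on the transform coefficients of the two mixture channels returns one row of estimated coefficients per source, which would then be passed through the inverse transform to obtain time-domain source estimates.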