On sparse and low-rank matrix decomposition for singing voice separation

  • Authors:
  • Yi-Hsuan Yang

  • Affiliations:
  • Research Center for IT Innovation, Academia Sinica, Taipei, Taiwan, ROC

  • Venue:
  • Proceedings of the 20th ACM international conference on Multimedia
  • Year:
  • 2012

Abstract

Over recent years there has been a growing interest in finding ways to transform signals/matrices into sparse or low-rank representations, i.e., representations which are sparse in support or of low redundancy. Such decompositions are proving to be particularly powerful for a variety of signal processing and compression problems. In this paper, we investigate the application of this technique to the challenging task of singing voice/accompaniment separation for popular music. The vocal part is modeled as a sparse signal, whereas the instrumental part is considered to be low-rank. In addition, to better account for the particular properties of music, two new algorithms are proposed to improve the decomposition: one incorporates harmonicity priors, and the other is a back-end drum removal procedure. Evaluations on the MIR-1K benchmark dataset show that the proposed algorithms outperform the state of the art by 0.01-2.41 dB.
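
To make the sparse plus low-rank model concrete, here is a minimal sketch of one common way to compute such a decomposition: principal component pursuit, which minimizes the nuclear norm of L plus lam times the L1 norm of S subject to M = L + S, solved here by alternating shrinkage steps. The abstract does not name a solver, so the algorithm choice, the function names (soft_threshold, svd_threshold, rpca), and the default values of lam and mu are assumptions made for illustration, not the paper's implementation. The harmonicity priors and the drum removal step are not covered.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage: proximal operator of tau * L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Singular value shrinkage: proximal operator of tau * nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca(m, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Split a nonzero matrix m into low-rank and sparse parts (illustrative sketch).

    lam and mu fall back to commonly used defaults; both are assumptions
    here, not values taken from the paper.
    """
    m = np.asarray(m, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(m.shape))
    if mu is None:
        mu = 0.25 * m.size / np.abs(m).sum()

    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)  # Lagrange multiplier for m = low + sparse
    norm_m = np.linalg.norm(m)

    for _ in range(max_iter):
        # Update each part with the other held fixed.
        low = svd_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + dual / mu, lam / mu)
        # Dual ascent on the constraint violation.
        residual = m - low - sparse
        dual = dual + mu * residual
        if np.linalg.norm(residual) <= tol * norm_m:
            break
    return low, sparse
```

In the setting the abstract describes, m would be a time-frequency representation of the mixture (assumed here to be a magnitude spectrogram), with the low part read as the accompaniment estimate and the sparse part as the vocal estimate. A larger lam pushes more energy into the low-rank part, so it acts as the main knob trading vocal leakage against accompaniment leakage.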