Perceptually enhanced blind single-channel music source separation by Non-negative Matrix Factorization

Authors:
S. Kırbız; B. Günsel
Affiliations:
Multimedia Signal Processing and Pattern Recognition Lab., Istanbul Technical University, Department of Electronics and Communications Engineering, 34469 Maslak, Istanbul, Turkey;Multimedia Signal Processing and Pattern Recognition Lab., Istanbul Technical University, Department of Electronics and Communications Engineering, 34469 Maslak, Istanbul, Turkey
Venue:
Digital Signal Processing
Year:
2013



Abstract

We propose a new approach that improves the perceptual quality of the separated sources in blind single-channel musical source separation. It exploits subspace learning based on Non-negative Matrix Factorization (NMF), in which the bases represent the notes. The cost function is formulated as a weighted β-divergence by incorporating the PEAQ auditory model defined in ITU-R BS.1387 into the source separation. The proposed perceptually weighted factorization scheme is integrated into Non-negative Matrix Factor 2-D Deconvolution (NMF2D) and Clustered Non-negative Matrix Factorization (CNMF) to overcome the source clustering problem encountered in under-determined source separation. It is shown that the introduced perceptually weighted NMF schemes, named PW-NMF2D and PW-CNMF, efficiently learn the bases, which enables a simple resynthesis of the musical sources based on the temporal model stored in the encoding matrix. Source separation performance is reported on musical mixtures, where an improvement of 1-2 dB is achieved in terms of SDR, SIR and SAR compared to the state-of-the-art methods. Performance has also been evaluated by perceptual measures, resulting in an improvement of 2-5 in OPS, TPS, IPS and APS values.
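
To make the idea of a weighted β-divergence cost concrete, the following is a minimal sketch of plain NMF with a per-bin weight matrix, using the commonly used multiplicative-update heuristic. It is not the authors' PW-NMF2D or PW-CNMF: the weights here are arbitrary placeholder values rather than weights derived from the PEAQ auditory model, there is no 2-D deconvolution or clustering step, and the names `beta_divergence` and `weighted_nmf` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # small constant to avoid division by zero and log(0)


def beta_divergence(V, L, beta, weights):
    """Weighted beta-divergence sum_ft weights_ft * d_beta(V_ft | L_ft)."""
    V = V + EPS
    L = L + EPS
    if beta == 1:      # Kullback-Leibler divergence
        d = V * np.log(V / L) - V + L
    elif beta == 0:    # Itakura-Saito divergence
        d = V / L - np.log(V / L) - 1
    else:              # general case (beta = 2 gives half the squared error)
        d = (V**beta + (beta - 1) * L**beta - beta * V * L**(beta - 1)) / (
            beta * (beta - 1)
        )
    return float(np.sum(weights * d))


def weighted_nmf(V, weights, rank, beta=1.0, n_iter=200, seed=0):
    """Factorize V ~ B @ G under a weighted beta-divergence.

    B holds the spectral bases (columns), G the encoding matrix (rows).
    `weights` has the same shape as V; in a perceptual scheme it would come
    from an auditory model, which is NOT implemented here (assumption).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    B = rng.random((n_freq, rank)) + 0.1
    G = rng.random((rank, n_frames)) + 0.1
    costs = []
    for _ in range(n_iter):
        L = B @ G + EPS
        B *= ((weights * V * L**(beta - 2)) @ G.T) / (
            (weights * L**(beta - 1)) @ G.T + EPS
        )
        L = B @ G + EPS
        G *= (B.T @ (weights * V * L**(beta - 2))) / (
            B.T @ (weights * L**(beta - 1)) + EPS
        )
        costs.append(beta_divergence(V, B @ G, beta, weights))
    return B, G, costs


# Toy data: a random non-negative "spectrogram" and placeholder weights.
rng = np.random.default_rng(1)
V = rng.random((20, 30))
weights = 0.5 + rng.random((20, 30))  # arbitrary positive weights (assumption)
B, G, costs = weighted_nmf(V, weights, rank=3, beta=1.0)
```

In this sketch a larger weight makes a mismatch in that time-frequency bin more costly, so the learned bases favour accuracy where the weights are high. Each source could then be resynthesized from the subset of bases and encoding rows assigned to it, a step that is not shown here.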