Regularized nonnegative matrix factorization using Gaussian mixture priors for supervised single channel source separation

Authors:
Emad M. Grais; Hakan Erdogan
Affiliations:
Faculty of Engineering and Natural Sciences, Sabanci University, Orhanli Tuzla, 34956 Istanbul, Turkey; Faculty of Engineering and Natural Sciences, Sabanci University, Orhanli Tuzla, 34956 Istanbul, Turkey
Venue:
Computer Speech and Language
Year:
2013

Abstract

We introduce a new regularized nonnegative matrix factorization (NMF) method for supervised single-channel source separation (SCSS). We propose a new multi-objective cost function that combines the conventional NMF divergence term with a prior likelihood term. The first term measures the divergence between the observed data and the product of the basis and gains matrices. The novel second term encourages the log-normalized gain vectors of the NMF solution to have high likelihood under a prior Gaussian mixture model (GMM), so that the gains follow certain patterns. In this model, the parameters to be estimated are the basis vectors, the gain vectors and the parameters of the GMM prior. We introduce two ways to train the model parameters: sequential training and joint training. In sequential training, the basis and gains matrices are found first, and the gains matrix is then used to train the prior GMM in a separate step. In joint training, the basis matrix, the gains matrix and the prior GMM parameters are updated jointly within each NMF iteration using the proposed regularized NMF. Normalizing the gains makes the prior models energy independent, which is an advantage over earlier proposals. In addition, the GMM is a much richer prior than previously considered alternatives such as conjugate priors, which may not represent the distribution of the gains in the best possible way. In the separation stage, after observing the mixed signal, we use the proposed regularized cost function with a combined basis and the GMM priors for all sources, both learned from the training data of each source. Only the gain vectors are estimated from the mixed data by minimizing the joint cost function. We introduce novel update rules that solve the optimization problem efficiently for the new regularized NMF problem. 
This optimization is challenging because the energy normalization and the GMM prior make the problem highly nonlinear and non-convex. The experimental results show that the introduced methods improve the performance of single-channel source separation for speech separation and speech-music separation with different NMF divergence functions. The experimental results also show that using the GMM prior gives better separation results than using the conjugate prior.
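
As a rough illustration of the two-term cost described in the abstract, the following Python sketch evaluates a divergence term plus a GMM prior term on log-normalized gain vectors. It is not the authors' implementation, and several details are assumptions made only for this example: the generalized KL divergence stands in for "the conventional divergence term", each gain column is normalized by its L2 norm, the GMM has diagonal covariances, and the trade-off weight `lam` and all function names are hypothetical. The paper's update rules are not reproduced here.

```python
import numpy as np


def kl_divergence(V, B, G, eps=1e-12):
    """Generalized KL divergence D(V || BG); one common NMF divergence (assumed here)."""
    R = B @ G + eps
    return float(np.sum(V * np.log((V + eps) / R) - V + R))


def log_normalized_gains(G, eps=1e-12):
    """Log of each gain column after L2 normalization (normalization choice is an assumption)."""
    norms = np.linalg.norm(G, axis=0, keepdims=True)
    return np.log(G / (norms + eps) + eps)


def gmm_log_likelihood(X, weights, means, variances):
    """Total log-likelihood of the columns of X under a diagonal-covariance GMM.

    X: (d, n); weights: (K,); means and variances: (K, d).
    """
    diff = X.T[:, None, :] - means[None, :, :]                       # (n, K, d)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm[None, :] - 0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    log_joint = log_comp + np.log(weights)[None, :]                  # (n, K)
    # Log-sum-exp over mixture components for numerical stability.
    m = log_joint.max(axis=1, keepdims=True)
    log_px = m[:, 0] + np.log(np.sum(np.exp(log_joint - m), axis=1))
    return float(np.sum(log_px))


def regularized_cost(V, B, G, gmm, lam=1.0):
    """Divergence term minus lam times the prior log-likelihood of the log-normalized gains."""
    weights, means, variances = gmm
    prior = gmm_log_likelihood(log_normalized_gains(G), weights, means, variances)
    return kl_divergence(V, B, G) - lam * prior
```

Because the prior term sees only the normalized gains, rescaling `G` by a positive constant leaves that term essentially unchanged, which is the energy independence the abstract refers to. In the separation stage described above, `B` and the GMM parameters would be held fixed and only `G` would be varied to reduce this cost.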