Convolutive nonnegative matrix factorization with Markov random field smoothing for blind unmixing of multichannel speech recordings

Authors:
Rafal Zdunek
Affiliations:
Institute of Telecommunications, Teleinformatics and Acoustics, Wroclaw University of Technology, Wroclaw, Poland
Venue:
NOLISP'11 Proceedings of the 5th international conference on Advances in nonlinear speech processing
Year:
2011

Abstract

This paper addresses blind unmixing of multichannel speech recordings in the underdetermined, convolutive case. The power spectrogram of each source is modeled as a superposition of nonnegative rank-1 basic spectrograms, which leads to a Nonnegative Matrix Factorization (NMF) model for each source. Because the number of recording channels may be lower than the number of true sources (speakers), the problem may be underdetermined. Hence, any meaningful a priori information about the sources or the mixing operator can improve the results of blind separation. In our approach, we assume that the basic rank-1 power spectrograms are locally smooth in both the frequency and time domains. To enforce this local smoothness, we incorporate a Markov Random Field (MRF) model, in the form of a Gibbs prior, into the complete-data likelihood function. Simulations demonstrate that this approach considerably improves the separation results.
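
The idea of an NMF source model with a local smoothness prior can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes a single-channel spectrogram, a Euclidean fitting cost, and a quadratic nearest-neighbor potential as one simple choice of Gibbs energy, whereas the paper works with a multichannel convolutive mixture and a complete-data likelihood. The function names (`path_adjacency`, `objective`, `smooth_nmf`) and the weights `alpha_f` and `alpha_t` are hypothetical and introduced only for illustration.

```python
import numpy as np


def path_adjacency(n):
    """Adjacency matrix of a chain: each bin is linked to its two nearest neighbors."""
    A = np.zeros((n, n))
    i = np.arange(n - 1)
    A[i, i + 1] = 1.0
    A[i + 1, i] = 1.0
    return A


def objective(V, W, H, alpha_f, alpha_t):
    """Euclidean fit plus quadratic roughness energies (assumed Gibbs potentials)."""
    fit = 0.5 * np.sum((V - W @ H) ** 2)
    rough_f = 0.5 * np.sum(np.diff(W, axis=0) ** 2)  # roughness along frequency
    rough_t = 0.5 * np.sum(np.diff(H, axis=1) ** 2)  # roughness along time
    return fit + alpha_f * rough_f + alpha_t * rough_t


def smooth_nmf(V, rank, alpha_f=0.1, alpha_t=0.1, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative spectrogram V (freq x time) as W @ H.

    Column k of W and row k of H form one rank-1 basic spectrogram.
    Multiplicative updates keep W and H nonnegative; the neighbor terms
    pull every entry toward the values of its adjacent bins.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + 0.1
    H = rng.random((rank, T)) + 0.1
    A_f, A_t = path_adjacency(F), path_adjacency(T)
    d_f, d_t = A_f.sum(axis=1), A_t.sum(axis=1)  # number of neighbors per bin
    history = []
    for _ in range(n_iter):
        # Time smoothing acts on the rows of H (activations).
        H *= (W.T @ V + alpha_t * (H @ A_t)) / (W.T @ W @ H + alpha_t * H * d_t + eps)
        # Frequency smoothing acts on the columns of W (spectral shapes).
        W *= (V @ H.T + alpha_f * (A_f @ W)) / (
            W @ (H @ H.T) + alpha_f * d_f[:, None] * W + eps
        )
        history.append(objective(V, W, H, alpha_f, alpha_t))
    return W, H, history
```

Calling `smooth_nmf(V, rank=3)` on a nonnegative array `V` returns the two factors together with the objective value after each iteration. Setting `alpha_f` and `alpha_t` to zero reduces the sketch to plain Euclidean NMF, which makes it easy to compare factors obtained with and without the smoothness terms.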