This paper addresses the blind unmixing of multichannel speech recordings in the underdetermined, convolutive case. The power spectrogram of each source is modeled as a superposition of nonnegative rank-1 basic spectrograms, which leads to a Nonnegative Matrix Factorization (NMF) model for each source. Because the number of recording channels may be lower than the number of true sources (speakers), the problem can be underdetermined, so any meaningful a priori information about the sources or the mixing operator can improve the separation results. In our approach, we assume that the basic rank-1 power spectrograms are locally smooth in both the frequency and time domains. To enforce this local smoothness, we incorporate a Markov Random Field (MRF) model, in the form of a Gibbs prior, into the complete-data likelihood function. The simulations demonstrate that this approach considerably improves the separation results.
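The two modeling ingredients named above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the Gibbs potential, so the quadratic first-order potential used here, the weight `beta`, and the function names `superpose_rank1`, `gibbs_energy`, and `log_gibbs_prior` are all assumptions made for illustration. The sketch only shows that a sum of nonnegative rank-1 spectrograms is an NMF product, and that a Gibbs prior of this kind scores a locally smooth spectrogram higher than a rough one.

```python
import numpy as np


def superpose_rank1(W, H):
    """Sum the nonnegative rank-1 spectrograms w_k h_k^T (equals W @ H)."""
    return sum(np.outer(W[:, k], H[k, :]) for k in range(W.shape[1]))


def gibbs_energy(S):
    """Assumed quadratic MRF energy over first-order neighbours.

    Axis 0 is frequency and axis 1 is time, so the energy sums squared
    differences between vertically and horizontally adjacent bins.
    """
    d_freq = np.diff(S, axis=0)
    d_time = np.diff(S, axis=1)
    return float(np.sum(d_freq ** 2) + np.sum(d_time ** 2))


def log_gibbs_prior(S, beta=1.0):
    """Unnormalised log of a Gibbs prior p(S) proportional to exp(-beta * U(S))."""
    return -beta * gibbs_energy(S)


rng = np.random.default_rng(0)
F, T, K = 64, 40, 3          # frequency bins, time frames, basis count (illustrative)
W = rng.random((F, K))       # nonnegative spectral bases
H = rng.random((K, T))       # nonnegative temporal activations
V = superpose_rank1(W, H)    # modeled power spectrogram of one source

# A locally smooth rank-1 spectrogram and a noisy copy of it.
smooth = np.outer(np.hanning(F), np.hanning(T))
rough = smooth + 0.1 * rng.random((F, T))

print(np.allclose(V, W @ H))
print(log_gibbs_prior(smooth) > log_gibbs_prior(rough))
```

In a full separation method, a log-prior term of this kind would be added to the complete-data log-likelihood so that the estimated basic spectrograms are pulled toward local smoothness; the strength of that pull is controlled by the hypothetical weight `beta`.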