Stereo Source Separation and Source Counting with MAP Estimation with Dirichlet Prior Considering Spatial Aliasing Problem

Authors:
Shoko Araki;Tomohiro Nakatani;Hiroshi Sawada;Shoji Makino
Affiliations:
NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan 619-0237;NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan 619-0237;NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan 619-0237;NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan 619-0237
Venue:
ICA '09 Proceedings of the 8th International Conference on Independent Component Analysis and Signal Separation
Year:
2009

Abstract

In this paper, we propose a novel sparse source separation method that estimates the number of sources and the time-frequency masks simultaneously, even when spatial aliasing occurs. Many sparse source separation approaches based on time-frequency masks have been proposed recently, but most of them require the number of sources to be known in advance. In our proposed method, we model the phase difference of arrival (PDOA) between microphones with a Gaussian mixture model (GMM) with a Dirichlet prior, and we estimate the model parameters by maximum a posteriori (MAP) estimation based on the EM algorithm. To avoid one cluster being modeled by two or more Gaussians, we use a sparse distribution, modeled by a Dirichlet distribution, as the prior of the GMM mixture weights. Moreover, to handle wide microphone spacings where spatial aliasing occurs, the 2πk indeterminacy of the phase (k an integer) is also included in our model. Experimental results show good performance of our proposed method.
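
The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single frequency bin, a symmetric Dirichlet prior with a hypothetical hyperparameter `phi`, a fixed range of integers k (`k_max`), and clipping of negative weight numerators at zero. The function names `map_weights` and `wrapped_gmm_map_em` are illustrative only.

```python
import numpy as np


def map_weights(counts, phi):
    """MAP mixture weights under a symmetric Dirichlet(phi) prior.

    With phi < 1 the numerator counts + phi - 1 can become negative;
    clipping it at zero (an assumption of this sketch) switches the
    component off.
    """
    raw = np.maximum(np.asarray(counts, dtype=float) + phi - 1.0, 0.0)
    return raw / raw.sum()


def wrapped_gmm_map_em(x, n_comp=6, phi=0.1, k_max=1, n_iter=50):
    """MAP-EM for a 1-D GMM over phases x in (-pi, pi] with 2*pi*k shifts."""
    x = np.asarray(x, dtype=float)
    shifts = 2.0 * np.pi * np.arange(-k_max, k_max + 1)
    xs = x[:, None, None] + shifts[None, None, :]      # (N, 1, K) unwrapped candidates
    mu = np.linspace(-np.pi, np.pi, n_comp, endpoint=False)
    var = np.full(n_comp, 0.5)
    w = np.full(n_comp, 1.0 / n_comp)

    for _ in range(n_iter):
        # E-step: responsibility of component m and shift k for each sample.
        diff = xs - mu[None, :, None]                  # (N, M, K)
        v = var[None, :, None]
        dens = np.exp(-0.5 * diff**2 / v) / np.sqrt(2.0 * np.pi * v)
        joint = w[None, :, None] * dens
        norm = np.maximum(joint.sum(axis=(1, 2), keepdims=True), 1e-300)
        resp = joint / norm

        # M-step: update only components that still own some data.
        counts = resp.sum(axis=(0, 2))                 # (M,)
        alive = counts > 1e-8
        safe = np.maximum(counts, 1e-12)
        mu_new = (resp * xs).sum(axis=(0, 2)) / safe
        mu = np.where(alive, mu_new, mu)
        var_new = (resp * (xs - mu[None, :, None]) ** 2).sum(axis=(0, 2)) / safe
        var = np.where(alive, np.maximum(var_new, 1e-3), var)
        mu = (mu + np.pi) % (2.0 * np.pi) - np.pi      # keep means in [-pi, pi)
        w = map_weights(counts, phi)                   # sparse Dirichlet MAP update

    return w, mu, var
```

In such a sketch, components whose weight reaches zero could be read as unused, so the number of remaining components would serve as a rough source-count estimate. How reliably redundant components are pruned depends on `phi`, the initialization, and the data, none of which are taken from the paper.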