In this paper, we propose a novel sparse source separation method that simultaneously estimates the number of sources and the time-frequency masks, even in the presence of spatial aliasing. Many sparse source separation approaches based on time-frequency masks have been proposed recently, but most of them require the number of sources to be known in advance. In the proposed method, we model the phase difference of arrival (PDOA) between microphones with a Gaussian mixture model (GMM) whose mixture weights have a Dirichlet prior, and we estimate the model parameters by maximum a posteriori (MAP) estimation based on the EM algorithm. To prevent a single cluster from being modeled by two or more Gaussians, we use a sparse Dirichlet distribution as the prior on the GMM mixture weights. Moreover, to handle wide microphone spacings where spatial aliasing occurs, the model also includes the 2πk indeterminacy of the phase (k an integer). Experimental results show good performance of the proposed method.
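The estimation procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made for illustration: each Gaussian is parameterized by a delay `mu[k]` so that its mean phase is `omega * mu[k]`, the variance is defined in the phase domain and shared across frequencies, the 2πk indeterminacy is handled by summing over a finite set of offsets (`max_wrap`), and the sparse Dirichlet prior (hyperparameter `a < 1`) is applied through the common clipped update `alpha[k] ∝ max(N[k] + a - 1, 0)`. The names `map_weights` and `fit_wrapped_gmm` are hypothetical.

```python
import numpy as np

def map_weights(counts, a):
    """MAP mixture weights under a symmetric Dirichlet(a) prior.

    With a < 1 the prior is sparse: components whose soft count falls
    below 1 - a are clipped to zero weight (assumed clipping heuristic).
    """
    w = np.maximum(counts + a - 1.0, 0.0)
    return w / w.sum()

def fit_wrapped_gmm(omega, phi, max_delay, n_comp=6, a=0.5,
                    max_wrap=2, n_iter=100, var_floor=1e-4):
    """EM-based MAP fit of a GMM to phase differences with 2*pi*k ambiguity.

    omega: angular frequency (rad/s) of each time-frequency point, shape (T,)
    phi:   observed PDOA wrapped to [-pi, pi), shape (T,)
    Returns weights, delays, phase variances, and soft masks of shape (T, n_comp).
    """
    eps = 1e-12
    shifts = 2.0 * np.pi * np.arange(-max_wrap, max_wrap + 1)
    x = phi[:, None] + shifts[None, :]             # unwrapped candidates (T, N)
    mu = np.linspace(-max_delay, max_delay, n_comp)  # initial delays
    sigma2 = np.ones(n_comp)
    alpha = np.full(n_comp, 1.0 / n_comp)

    for _ in range(n_iter):
        # E-step: joint responsibility of component k and offset n per point.
        r = x[:, None, :] - (omega[:, None] * mu[None, :])[:, :, None]
        with np.errstate(divide="ignore"):         # log(0) for pruned components
            log_p = (np.log(alpha)[None, :, None]
                     - 0.5 * np.log(2.0 * np.pi * sigma2)[None, :, None]
                     - 0.5 * r ** 2 / sigma2[None, :, None])
        log_p -= log_p.max(axis=(1, 2), keepdims=True)
        g = np.exp(log_p)
        g /= g.sum(axis=(1, 2), keepdims=True)

        # M-step: sparse MAP weights, weighted least-squares delays, variances.
        counts = g.sum(axis=(0, 2))
        alpha = map_weights(counts, a)
        num = (g * omega[:, None, None] * x[:, None, :]).sum(axis=(0, 2))
        den = (g.sum(axis=2) * (omega ** 2)[:, None]).sum(axis=0)
        mu = np.where(counts > eps, num / np.maximum(den, eps), mu)
        r = x[:, None, :] - (omega[:, None] * mu[None, :])[:, :, None]
        sigma2 = np.maximum((g * r ** 2).sum(axis=(0, 2))
                            / np.maximum(counts, eps), var_floor)

    masks = g.sum(axis=2)  # responsibilities from the last E-step, per component
    return alpha, mu, sigma2, masks
```

In this sketch, the number of sources would be read off as the number of components whose weight in `alpha` remains non-negligible after convergence, and the columns of `masks` for those components would serve as soft time-frequency masks. As with any EM procedure, the result depends on the initialization, so `n_comp`, `a`, and `max_delay` would need tuning for a given microphone spacing.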