Complex extension of infinite sparse factor analysis for blind speech separation

Authors:
Kohei Nagira;Toru Takahashi;Tetsuya Ogata;Hiroshi G. Okuno
Affiliations:
Graduate School of Informatics, Kyoto University, Kyoto, Japan;Graduate School of Informatics, Kyoto University, Kyoto, Japan;Graduate School of Informatics, Kyoto University, Kyoto, Japan;Graduate School of Informatics, Kyoto University, Kyoto, Japan
Venue:
LVA/ICA'12 Proceedings of the 10th international conference on Latent Variable Analysis and Signal Separation
Year:
2012

Citing 3
Cited 2

Infinite sparse factor analysis and infinite independent components analysis

ICA'07 Proceedings of the 7th international conference on Independent component analysis and signal separation
First stereo audio source separation evaluation campaign: data, algorithms and results

ICA'07 Proceedings of the 7th international conference on Independent component analysis and signal separation
A blind source separation technique using second-order statistics

IEEE Transactions on Signal Processing

Infinite sparse factor analysis for blind source separation in reverberant environments

SSPR'12/SPR'12 Proceedings of the 2012 Joint IAPR international conference on Structural, Syntactic, and Statistical Pattern Recognition
Bayesian Nonparametrics for Microphone Array Processing

IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present a method of blind source separation (BSS) for speech signals using a complex extension of infinite sparse factor analysis (ISFA) in the frequency domain. Our method is robust against delayed signals that usually occur in real environments, such as reflections, short-time reverberations, and time lags of signals arriving at microphones. ISFA is a conventional non-parametric Bayesian method of BSS, which has only been applied to time domain signals because it can only deal with real signals. Our method uses complex normal distributions to estimate source signals and mixing matrix. Experimental results indicate that our method outperforms the conventional ISFA in the average signal-to-distortion ratio (SDR).