Blind Separation and Deconvolution for Convolutive Mixture of Speech Combining SIMO-Model-Based ICA and Multichannel Inverse Filtering

  • Authors:
  • Hiroshi Saruwatari; Hiroaki Yamajo; Tomoya Takatani; Tsuyoki Nishikawa; Kiyohiro Shikano

  • Affiliations:
  • The authors are with the Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma-shi, 630-0192 Japan. E-mail: sawatari@is.naist.jp

  • Venue:
  • IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
  • Year:
  • 2005

Abstract

We propose a new two-stage blind separation and deconvolution (BSD) strategy for multiple-input multiple-output (MIMO)-FIR systems driven by colored sound sources, in which single-input multiple-output (SIMO)-model-based ICA (SIMO-ICA) and blind multichannel inverse filtering are combined. SIMO-ICA separates the mixed signals, not into monaural source signals, but into SIMO-model-based signals from the independent sources as they arrive at the microphones. After separation by SIMO-ICA, a blind deconvolution technique for the SIMO model can be applied even when each source signal is temporally correlated and the mixing system is nonminimum-phase. Simulation results reveal that the proposed algorithm successfully achieves separation and deconvolution of a convolutive mixture of speech and outperforms a number of conventional ICA-based BSD methods.
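
To make the second stage concrete, the sketch below illustrates multichannel inverse filtering of a SIMO system in the MINT (multiple-input/output inverse theorem) sense, using NumPy. This is a minimal illustration under assumed conditions, not the authors' blind algorithm: the SIMO impulse responses h1 and h2 are taken as known (in the paper the SIMO model is obtained blindly via SIMO-ICA), and the toy source is white noise rather than colored speech.

    # Minimal MINT-style multichannel inverse filtering sketch (illustrative only).
    # Assumption: the SIMO channel responses h1, h2 are known here; the paper
    # estimates the SIMO model blindly via SIMO-ICA before this stage.
    import numpy as np

    def convolution_matrix(h, n_cols):
        """Toeplitz matrix H such that H @ g == np.convolve(h, g) for len(g) == n_cols."""
        H = np.zeros((len(h) + n_cols - 1, n_cols))
        for j in range(n_cols):
            H[j:j + len(h), j] = h
        return H

    def mint_inverse_filters(h1, h2, filter_len):
        """Least-squares FIR filters g1, g2 with h1*g1 + h2*g2 ~= unit impulse."""
        H = np.hstack([convolution_matrix(h1, filter_len),
                       convolution_matrix(h2, filter_len)])
        d = np.zeros(H.shape[0])
        d[0] = 1.0                                   # target: Kronecker delta
        g, *_ = np.linalg.lstsq(H, d, rcond=None)
        return g[:filter_len], g[filter_len:]

    # Toy nonminimum-phase SIMO channels (assumed values, for illustration only)
    h1 = np.array([1.0, 0.6, -0.3, 0.1])
    h2 = np.array([0.8, -0.5, 0.2, 0.05])
    g1, g2 = mint_inverse_filters(h1, h2, filter_len=8)

    # Deconvolution: the two SIMO observations of one source, filtered and summed,
    # should approximately reproduce the dry source signal.
    rng = np.random.default_rng(0)
    s = rng.standard_normal(2000)                    # white here; colored speech in the paper
    x1, x2 = np.convolve(s, h1), np.convolve(s, h2)
    s_hat = np.convolve(x1, g1) + np.convolve(x2, g2)
    err = s_hat[:len(s)] - s
    print("reconstruction SNR [dB]:", 10 * np.log10(np.sum(s**2) / np.sum(err**2)))

Because the two channels are inverted jointly, an FIR inverse exists even when each channel is nonminimum-phase, provided the channel responses share no common zeros and the inverse filters are long enough; this is the property the SIMO-model-based second stage exploits.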