Blind Source Separation of Convolutive Mixtures of Speech in Frequency Domain

  • Authors:
  • Shoji Makino;Hiroshi Sawada;Ryo Mukai;Shoko Araki

  • Affiliations:
  • The authors are with NTT Communication Science Laboratories, NTT Corporation, Kyoto-fu, 619-0237 Japan. E-mail: maki@cslab.kecl.ntt.co.jp

  • Venue:
  • IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
  • Year:
  • 2005

Quantified Score

Hi-index 0.01

Abstract

This paper gives an overview of a complete solution for frequency-domain blind source separation (BSS) of convolutive mixtures of audio signals, especially speech. Frequency-domain BSS performs independent component analysis (ICA) separately in each frequency bin, which is more efficient than time-domain BSS. We describe the full solution, including methods for the permutation, scaling, and circularity problems and for the choice of a complex activation function. Experimental results in a room for 2 × 2, 3 × 3, 4 × 4, 6 × 8, and 2 × 2 with moving sources (#sources × #microphones) are promising.
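
To make the idea of "ICA in each frequency bin" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Hann-windowed short-time Fourier transform, per-bin whitening, and a natural-gradient ICA update with the polar-coordinate nonlinearity tanh(|y|)·exp(j·arg y); the abstract does not state which activation function, step size, or frame length the paper uses, so those choices and all function names (`stft`, `whiten`, `ica_one_bin`, `separate_per_bin`) are illustrative assumptions. The sketch deliberately leaves the permutation and scaling ambiguities unresolved: each bin's outputs come back in arbitrary order and scale.

```python
import numpy as np


def stft(x, n_fft=256, hop=64):
    """Hann-windowed STFT of a 1-D real signal; returns (n_bins, n_frames)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T


def whiten(X, eps=1e-12):
    """Return V such that V @ X has (approximately) identity covariance.

    X has shape (n_mics, n_frames) and holds one frequency bin.
    """
    C = X @ X.conj().T / X.shape[1]      # Hermitian covariance matrix
    d, E = np.linalg.eigh(C)             # C = E diag(d) E^H
    return np.diag(1.0 / np.sqrt(d + eps)) @ E.conj().T


def ica_one_bin(Z, n_iter=100, lr=0.1):
    """Natural-gradient complex ICA on one whitened bin Z (n_mics, n_frames).

    Update rule (an assumed, commonly used form):
        W <- W + lr * (I - E[phi(Y) Y^H]) W
    with phi(y) = tanh(|y|) * exp(j * arg(y)).
    """
    n, T = Z.shape
    W = np.eye(n, dtype=complex)
    for _ in range(n_iter):
        Y = W @ Z
        phi = np.tanh(np.abs(Y)) * np.exp(1j * np.angle(Y))
        R = phi @ Y.conj().T / T
        W = W + lr * (np.eye(n) - R) @ W
    return W


def separate_per_bin(mics, n_fft=256, hop=64):
    """Run ICA independently in every frequency bin.

    mics: real array (n_mics, n_samples).
    Returns Y with shape (n_bins, n_mics, n_frames). The order and scale of
    the outputs are arbitrary in each bin (permutation and scaling are NOT
    resolved here).
    """
    X = np.stack([stft(m, n_fft, hop) for m in mics], axis=1)
    Y = np.empty_like(X)
    for f in range(X.shape[0]):
        V = whiten(X[f])
        W = ica_one_bin(V @ X[f])
        Y[f] = W @ V @ X[f]
    return Y
```

Calling `separate_per_bin` on an array of microphone signals returns one set of separated spectra per bin. Before resynthesizing time-domain signals, the outputs would still have to be aligned across bins and rescaled, which is the role of the permutation and scaling solutions the abstract mentions.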