Multimodal speech separation

  • Authors:
  • Bertrand Rivet; Jonathon Chambers

  • Affiliations:
  • GIPSA-lab, CNRS UMR-5216, Grenoble INP, Grenoble, France; Electronic and Electrical Engineering, Loughborough University, UK

  • Venue:
  • NOLISP'09: Proceedings of the 2009 International Conference on Advances in Nonlinear Speech Processing
  • Year:
  • 2009

Abstract

The work of Bernstein and Benoît has confirmed that it is advantageous to use multiple senses, for example to employ both audio and visual modalities, in speech perception. As a consequence, looking at the speaker's face can be useful to better hear a speech signal in a noisy environment and to extract it from competing sources, as originally identified by Cherry, who posed the so-called “Cocktail Party” problem. To exploit the intrinsic coherence between audition and vision within a machine, the method of blind source separation (BSS) is particularly attractive.