Models for Audiovisual Fusion in a Noisy-Vowel Recognition Task

  • Authors:
  • Pascal Teissier;Anne Guerin-Dugue;Jean-Luc Schwartz

  • Affiliations:
  • Laboratoire des Images et des Signaux LIS, INPG, 46 Av. Felix-Viallet, 38031 Grenoble Cedex 1/ Institut de la Communication Parlee CNRS UPRESA 5009/INPG-U. Stendhal ICP, INPG, 46 Av. Felix-Viall ...;Laboratoire des Images et des Signaux LIS, INPG, 46 Av. Felix-Viallet, 38031 Grenoble Cedex 1;Institut de la Communication Parlee CNRS UPRESA 5009/INPG-U. Stendhal ICP, INPG, 46 Av. Felix-Viallet, 38031 Grenoble Cedex 1

  • Venue:
  • Journal of VLSI Signal Processing Systems - special issue on multimedia signal processing
  • Year:
  • 1998

Abstract

This paper presents a study of models for audiovisual (AV) fusion in a noisy-vowel recognition task. We progressively elaborate audiovisual models in order to respect the major principle demonstrated by human subjects in speech perception experiments (the “synergy” principle): audiovisual identification should always be more efficient than auditory-alone or visual-alone identification. We first recall that the efficiency of audiovisual speech recognition systems depends on the level at which they fuse sound and image: four AV architectures are presented, and two are selected for the remainder of the study. Secondly, we show the importance of providing a contextual input linked to the Signal-to-Noise Ratio (SNR) in the fusion process. Then we propose an original approach using an efficient nonlinear dimension reduction algorithm (“curvilinear components analysis”) in order to improve the performance of the two AV architectures. Furthermore, we show that this approach allows an easy and efficient estimation of the reliability of the audio sensor in relation to SNR, that this estimation can be used to control the AV fusion process, and that it significantly improves AV performance. Hence, altogether, nonlinear dimension reduction, context estimation and control of the fusion process enable us to respect the “synergy” criterion for the two most widely used architectures.
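
The idea of controlling AV fusion with an estimate of audio reliability can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the fusion rule, so the weighted geometric combination of per-class probabilities, the sigmoid mapping from SNR to weight, and every name and parameter below (audio_weight, fuse, classify, midpoint_db, slope) are assumptions made for illustration only.

```python
import math


def audio_weight(snr_db, midpoint_db=0.0, slope=0.25):
    """Map an estimated SNR (dB) to an audio reliability weight in (0, 1).

    Hypothetical sigmoid mapping; midpoint_db and slope are illustrative values,
    not parameters taken from the paper.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - midpoint_db)))


def fuse(p_audio, p_visual, snr_db):
    """Combine per-class audio and visual probabilities.

    Uses a weighted geometric mean: the audio stream dominates at high SNR,
    the visual stream dominates at low SNR. This rule is an assumption.
    """
    w = audio_weight(snr_db)
    scores = [(pa ** w) * (pv ** (1.0 - w)) for pa, pv in zip(p_audio, p_visual)]
    total = sum(scores)
    if total == 0.0:
        raise ValueError("all fused scores are zero; cannot normalize")
    return [s / total for s in scores]


def classify(p_audio, p_visual, snr_db):
    """Return the index of the class with the highest fused probability."""
    fused = fuse(p_audio, p_visual, snr_db)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Made-up probabilities for three vowel classes, for demonstration only.
    p_audio = [0.7, 0.2, 0.1]
    p_visual = [0.2, 0.6, 0.2]
    for snr in (30.0, 0.0, -30.0):
        print(snr, classify(p_audio, p_visual, snr))
```

With these made-up inputs, the decision follows the audio stream when the estimated SNR is high and the visual stream when it is low. Whether such a rule satisfies the “synergy” criterion at intermediate SNRs depends on how the weight is estimated, which is the question the paper addresses.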