Fusion of Audio-Visual Information for Integrated Speech Processing

  • Authors: Satoshi Nakamura
  • Venue: AVBPA '01 Proceedings of the Third International Conference on Audio- and Video-Based Biometric Person Authentication
  • Year: 2001


Abstract

This paper describes the integration of audio and visual speech information for robust adaptive speech processing. Because both the audio speech signal and the visible configuration of the face are produced by the same human speech organs, the two types of information are strongly correlated and often complement each other. Two applications that exploit this relationship are presented: bimodal speech recognition that integrates audio-visual information to remain robust to acoustic noise, and speaking face synthesis based on the correlation between audio and visual speech.
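Bimodal recognition of this kind is commonly realized by weighted (late) fusion of per-class log-likelihoods from the two streams. The abstract does not specify the paper's exact integration scheme, so the sketch below is a generic illustration with hypothetical scores; the stream weight `lam` and the example values are assumptions for demonstration only.

```python
def fuse_stream_scores(audio_logprobs, visual_logprobs, lam=0.7):
    """Weighted late fusion of per-class log-likelihoods from the
    audio and visual streams; lam in [0, 1] weights the audio stream.
    (Generic illustration, not the paper's specific method.)"""
    assert len(audio_logprobs) == len(visual_logprobs)
    return [lam * a + (1.0 - lam) * v
            for a, v in zip(audio_logprobs, visual_logprobs)]

def classify(audio_logprobs, visual_logprobs, lam=0.7):
    """Return the index of the class with the highest fused score."""
    fused = fuse_stream_scores(audio_logprobs, visual_logprobs, lam)
    return max(range(len(fused)), key=fused.__getitem__)

# Hypothetical scores: in acoustic noise the audio stream is nearly
# uninformative, while the visual stream clearly favours class 0.
audio = [-2.0, -1.9, -2.1]
visual = [-0.5, -3.0, -3.5]

# Lowering lam lets the reliable visual stream dominate the decision.
print(classify(audio, visual, lam=0.2))  # → 0
```

Making `lam` depend on an estimate of the acoustic noise level is one common way such systems adapt the balance between the two streams.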