Analysis of the visual Lombard effect and automatic recognition experiments

  • Authors:
  • Panikos Heracleous; Carlos T. Ishi; Miki Sato; Hiroshi Ishiguro; Norihiro Hagita

  • Affiliations:
  • ATR, Intelligent Robotics and Communication Laboratories, 2-2-2 Hikaridai Seika-cho, Soraku-gun, Kyoto-fu 619-0288, Japan (Heracleous, Ishi, Sato, Hagita); ATR, Hiroshi Ishiguro Laboratory, 2-2-2 Hikaridai Seika-cho, Soraku-gun, Kyoto-fu 619-0288, Japan (Ishiguro)

  • Venue:
  • Computer Speech and Language
  • Year:
  • 2013

Abstract

This study focuses on automatic visual speech recognition in the presence of noise. The authors show that, when speech is produced in noisy environments, articulatory changes occur because of the Lombard effect; these changes are both audible and visible. The authors analyze the visual Lombard effect and its role in automatic visual and audiovisual speech recognition. Experimental results using both English and Japanese data demonstrate that the Lombard effect degrades recognition in the visual speech domain: a lip-reading system designed without taking this factor into account performs worse. This matters for automatic audiovisual speech recognition in real noisy environments, where recognition rates decrease both because of the acoustic noise and because of the Lombard effect. The authors also show that the performance of an audiovisual speech recognizer depends on the visual Lombard effect and can be further improved when the effect is considered in the system's design.