Voice activity detection by lip shape tracking using EBGM

  • Authors:
  • Masaki Aoki;Ken Masuda;Hiroyoshi Matsuda;Tetsuya Takiguchi;Yasuo Ariki

  • Affiliations:
  • University of Kobe, Kobe, Japan;University of Kobe, Kobe, Japan;University of Kobe, Kobe, Japan;Kobe University, Kobe, Japan;Kobe University, Kobe, Japan

  • Venue:
  • Proceedings of the 15th International Conference on Multimedia
  • Year:
  • 2007

Abstract

We propose a voice activity detection method for a target speaker (the driver) in a car that integrates lip movement with acoustic processing. To prevent the false detections that nontarget speakers cause when only acoustic processing is used, the proposed system extracts the lip movement of the target speaker by measuring the lip aspect ratio. An infrared camera is used to cope with changes in the lighting environment. Elastic Bunch Graph Matching (EBGM) is employed to extract the lips from grayscale images. Experimental results showed that, in a car, the proposed system improved the precision rate of voice activity detection by approximately 40% compared with the method using only acoustic processing.
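
The abstract does not give the details of how the lip aspect ratio and the acoustic decision are combined, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes that the lip width and height per frame are already available (for example, from landmark positions found by a tracker such as EBGM), that lip movement is indicated by the variance of the aspect ratio over a short trailing window, and that the two decisions are fused with a logical AND. The function names, the window length, and the threshold are all hypothetical.

```python
from statistics import pvariance


def lip_aspect_ratio(width, height):
    """Return the height-to-width ratio of the lip region (assumed definition)."""
    if width <= 0:
        raise ValueError("lip width must be positive")
    return height / width


def visual_activity(ratios, window=5, var_threshold=1e-3):
    """Flag a frame as 'lips moving' when the aspect ratio varies enough.

    For each frame, the population variance of the aspect ratio over the
    trailing `window` frames is compared with `var_threshold`. Both values
    are illustrative assumptions, not figures from the paper.
    """
    flags = []
    for i in range(len(ratios)):
        chunk = ratios[max(0, i - window + 1): i + 1]
        # A single frame carries no movement information.
        flags.append(len(chunk) > 1 and pvariance(chunk) > var_threshold)
    return flags


def fuse(acoustic_flags, visual_flags):
    """Accept speech only when both the acoustic and visual detectors agree."""
    return [a and v for a, v in zip(acoustic_flags, visual_flags)]


if __name__ == "__main__":
    # Closed, still mouth followed by an opening and closing mouth.
    ratios = [0.2] * 5 + [0.2, 0.5, 0.2, 0.5, 0.2]
    # Acoustic detector fires throughout, e.g. because a passenger is talking.
    acoustic = [True] * len(ratios)
    print(fuse(acoustic, visual_activity(ratios)))
```

With an AND fusion of this kind, sound that arrives while the driver's lips are still is rejected, which is one plausible way to suppress detections triggered by nontarget speakers.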