Audiovisual Arrays for Untethered Spoken Interfaces

  • Authors:
  • Kevin Wilson;Vibhav Rangarajan;Neal Checka;Trevor Darrell

  • Affiliations:
  • Massachusetts Institute of Technology;Massachusetts Institute of Technology;Massachusetts Institute of Technology;Massachusetts Institute of Technology

  • Venue:
  • ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
  • Year:
  • 2002

Abstract

When faced with a distant speaker at a known location in a noisy environment, a microphone array can provide a significantly improved audio signal for speech recognition. Estimating the location of a speaker in a reverberant environment from audio information alone can be quite difficult, so we use an array of video cameras to aid localization. Stereo processing techniques are used on pairs of cameras, and foreground 3-D points are grouped to estimate the trajectory of people as they move in an environment. These trajectories are used to guide a microphone array beamformer. Initial results using this system for speech recognition demonstrate increased recognition rates compared to non-array processing techniques.
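
To illustrate how a known speaker position can steer a microphone array, the following is a minimal sketch of a delay-and-sum beamformer. The abstract does not state which beamforming method the authors use, so delay-and-sum is an assumption chosen for simplicity. The function names, the whole-sample delay rounding, and the speed-of-sound constant are likewise illustrative and not taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the paper


def steering_delays(mic_positions, source, sample_rate):
    """Per-microphone delay, in whole samples, that aligns sound arriving from `source`."""
    distances = [math.dist(mic, source) for mic in mic_positions]
    farthest = max(distances)
    # Nearer microphones hear the speaker earlier, so they are delayed
    # until they line up with the farthest microphone.
    return [round((farthest - d) / SPEED_OF_SOUND * sample_rate) for d in distances]


def delay_and_sum(channels, delays):
    """Shift each channel by its delay and average the aligned samples."""
    length = min(len(ch) for ch in channels)
    output = []
    for n in range(length):
        total = 0.0
        for ch, delay in zip(channels, delays):
            k = n - delay
            if 0 <= k < length:  # samples shifted in from outside the buffer count as zero
                total += ch[k]
        output.append(total / len(channels))
    return output


# Hypothetical two-microphone setup; the speaker position would come from the
# vision-based tracker described above.
mics = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
speaker = (3.0, 0.0, 0.0)
delays = steering_delays(mics, speaker, sample_rate=343.0)
```

In a tracked setting, `steering_delays` would be recomputed whenever the estimated speaker position changes. Signals from the steered location add coherently, while sound from other directions is averaged out of phase and attenuated.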