Detecting, tracking and interacting with people in a public space

  • Authors:
  • Sunsern Cheamanunkul (University of California, San Diego, La Jolla, CA, USA); Evan Ettinger (University of California, San Diego, La Jolla, CA, USA); Matt Jacobsen (University of California, San Diego, La Jolla, CA, USA); Patrick Lai (Stanford University, Stanford, CA, USA); Yoav Freund (University of California, San Diego, La Jolla, CA, USA)

  • Venue:
  • Proceedings of the 2009 International Conference on Multimodal Interfaces
  • Year:
  • 2009

Abstract

We have built a system that engages naive users in an audio-visual interaction with a computer in an unconstrained public space. We combine audio source localization techniques with face detection algorithms to detect and track the user throughout a large lobby. The sensors we use are an ad-hoc microphone array and a PTZ camera. To engage the user, the PTZ camera turns and points at sounds made by people passing by. From this simple pointing of a camera, the user is made aware that the system has acknowledged their presence. To further engage the user, we develop a face classification method that identifies and then greets previously seen users. The user can interact with the system through a simple hot-spot based gesture interface. To make the user interactions with the system feel natural, we utilize reconfigurable hardware, achieving a visual response time of less than 100ms. We rely heavily on machine learning methods to make our system self-calibrating and adaptive.
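The following is a minimal sketch of the control loop the abstract describes: point a PTZ camera toward a detected sound source, then use face detection to confirm and track the person. It is not the authors' implementation; `estimate_sound_direction()`, `pan_to()`, and `nudge()` are hypothetical placeholders standing in for the microphone-array localizer and the camera driver, while the face detector uses OpenCV's real Haar-cascade API rather than the paper's classifier.

```python
# Hedged sketch of the audio-then-vision engagement loop described in the
# abstract. Only the OpenCV face detector is a real API; the mic-array and
# PTZ interfaces below are assumed placeholders.
import cv2

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def find_faces(frame):
    """Return bounding boxes of frontal faces in a BGR frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

def engage_loop(camera, mic_array, ptz):
    while True:
        # 1. Audio cue: turn the camera toward the loudest sound source.
        direction = mic_array.estimate_sound_direction()   # hypothetical
        if direction is not None:
            ptz.pan_to(direction)                           # hypothetical

        # 2. Visual confirmation: find the largest face and nudge the
        #    camera to keep it centered, i.e. track the user.
        ok, frame = camera.read()
        if not ok:
            continue
        faces = find_faces(frame)
        if len(faces) > 0:
            x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
            dx = x + w / 2 - frame.shape[1] / 2
            dy = y + h / 2 - frame.shape[0] / 2
            ptz.nudge(dx, dy)                               # hypothetical
```

The split mirrors the paper's design: audio localization gives a coarse bearing over the whole lobby, while face detection provides the fine visual feedback needed for tracking and for the later face-classification and gesture stages.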