Detecting, tracking and interacting with people in a public space

  • Authors:
  • Sunsern Cheamanunkul (University of California, San Diego, La Jolla, CA, USA); Evan Ettinger (University of California, San Diego, La Jolla, CA, USA); Matt Jacobsen (University of California, San Diego, La Jolla, CA, USA); Patrick Lai (Stanford University, Stanford, CA, USA); Yoav Freund (University of California, San Diego, La Jolla, CA, USA)

  • Venue:
  • Proceedings of the 2009 International Conference on Multimodal Interfaces
  • Year:
  • 2009

Abstract

We have built a system that engages naive users in an audio-visual interaction with a computer in an unconstrained public space. We combine audio source localization techniques with face detection algorithms to detect and track the user throughout a large lobby. The sensors we use are an ad-hoc microphone array and a PTZ camera. To engage the user, the PTZ camera turns and points at sounds made by people passing by. From this simple pointing of a camera, the user is made aware that the system has acknowledged their presence. To further engage the user, we develop a face classification method that identifies and then greets previously seen users. The user can interact with the system through a simple hot-spot based gesture interface. To make the user interactions with the system feel natural, we utilize reconfigurable hardware, achieving a visual response time of less than 100ms. We rely heavily on machine learning methods to make our system self-calibrating and adaptive.
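The following is a minimal sketch of the control loop the abstract describes: point a PTZ camera toward a detected sound source, then use face detection to confirm and track the person. It is not the authors' implementation; `estimate_sound_direction()`, `pan_to()`, and `nudge()` are hypothetical placeholders standing in for the microphone-array localizer and the camera driver, while the face detector uses OpenCV's real Haar-cascade API rather than the paper's classifier.

```python
# Hedged sketch of the audio-then-vision engagement loop described in the
# abstract. Only the OpenCV face detector is a real API; the mic-array and
# PTZ interfaces below are assumed placeholders.
import cv2

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def find_faces(frame):
    """Return bounding boxes of frontal faces in a BGR frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

def engage_loop(camera, mic_array, ptz):
    while True:
        # 1. Audio cue: turn the camera toward the loudest sound source.
        direction = mic_array.estimate_sound_direction()   # hypothetical
        if direction is not None:
            ptz.pan_to(direction)                           # hypothetical

        # 2. Visual confirmation: find the largest face and nudge the
        #    camera to keep it centered, i.e. track the user.
        ok, frame = camera.read()
        if not ok:
            continue
        faces = find_faces(frame)
        if len(faces) > 0:
            x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
            dx = x + w / 2 - frame.shape[1] / 2
            dy = y + h / 2 - frame.shape[0] / 2
            ptz.nudge(dx, dy)                               # hypothetical
```

The split mirrors the paper's design: audio localization gives a coarse bearing over the whole lobby, while face detection provides the fine visual feedback needed for tracking and for the later face-classification and gesture stages.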