We have built a system that engages naive users in an audio-visual interaction with a computer in an unconstrained public space. We combine audio source localization with face detection to detect and track users throughout a large lobby, using an ad hoc microphone array and a pan-tilt-zoom (PTZ) camera as sensors. To engage a user, the PTZ camera turns and points at sounds made by people passing by; this simple act of pointing tells the user that the system has acknowledged their presence. To engage the user further, we develop a face classification method that identifies and then greets previously seen users. The user can interact with the system through a simple hot-spot-based gesture interface. To make these interactions feel natural, we use reconfigurable hardware to achieve a visual response time of under 100 ms. Throughout, we rely on machine learning methods to make the system self-calibrating and adaptive.
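The steering behavior described above (pointing the PTZ camera at a detected sound) can be pictured with a standard cross-correlation time-difference-of-arrival (TDOA) estimate from a microphone pair. The sketch below is illustrative only, not the authors' localization method; it assumes a far-field source and a single two-microphone pair, and `estimate_azimuth` and the `pan_to` call are hypothetical names.

```python
import numpy as np

def estimate_azimuth(sig_l, sig_r, mic_dist_m, fs, c=343.0):
    """Estimate source azimuth (degrees) from a two-microphone pair via the
    time difference of arrival at the cross-correlation peak.
    The sign convention depends on which channel is treated as 'left'."""
    n = len(sig_r)
    corr = np.correlate(sig_l, sig_r, mode="full")
    lag = int(np.argmax(corr)) - (n - 1)   # peak offset in samples
    tau = lag / fs                          # inter-microphone delay in seconds
    # Far-field model: tau = (d / c) * sin(theta)
    s = np.clip(tau * c / mic_dist_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))

# Steering: a hypothetical pan_to() stands in for the camera's control protocol.
# azimuth = estimate_azimuth(left_chan, right_chan, mic_dist_m=0.3, fs=16000)
# camera.pan_to(azimuth)
```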
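Similarly, the hot-spot gesture interface can be pictured as dwell-triggered screen regions fed by per-frame body or hand positions from the tracker. This is a minimal sketch under that assumption; `HotSpot`, its dwell threshold, and the region coordinates are illustrative, not the published interface.

```python
import time
from dataclasses import dataclass, field

@dataclass
class HotSpot:
    """An axis-aligned rectangular trigger region in image coordinates."""
    name: str
    x0: int
    y0: int
    x1: int
    y1: int
    dwell_s: float = 0.5                  # activity must persist this long to fire
    _entered: float | None = field(default=None, repr=False)

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1

    def update(self, x: float, y: float, now: float) -> bool:
        """Feed one tracked point; return True once the dwell threshold is met."""
        if not self.contains(x, y):
            self._entered = None
            return False
        if self._entered is None:
            self._entered = now
        if now - self._entered >= self.dwell_s:
            self._entered = None          # one-shot: re-arm after firing
            return True
        return False

# Usage: feed per-frame tracked points (here a synthetic stream) to each region.
spots = [HotSpot("greet", 50, 50, 150, 150), HotSpot("help", 400, 50, 500, 150)]
for x, y in [(100, 100)] * 40:            # stand-in for tracked hand centroids
    fired = [s.name for s in spots if s.update(x, y, time.monotonic())]
    if fired:
        print("triggered:", fired)
    time.sleep(0.02)
```

The dwell threshold trades responsiveness for robustness: a shorter dwell keeps the interface within the low-latency budget the system targets, while a longer one suppresses accidental triggers from people merely walking through a region.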