Estimating focus of attention based on gaze and sound

Authors:
Rainer Stiefelhagen;Jie Yang;Alex Waibel
Affiliations:
University of Karlsruhe, Germany;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA
Venue:
Proceedings of the 2001 workshop on Perceptive user interfaces
Year:
2001

Abstract

Estimating a person's focus of attention is useful for various human-computer interaction applications, such as smart meeting rooms, where a user's goals and intent have to be monitored. In the work presented here, we model focus of attention in a meeting situation. We have developed a system that estimates participants' focus of attention from multiple cues. We employ an omnidirectional camera to simultaneously track participants' faces around a meeting table and use neural networks to estimate their head poses. In addition, we use microphones to detect who is speaking. The system predicts participants' focus of attention from acoustic and visual information separately, and then combines the outputs of the audio- and video-based focus of attention predictors. We evaluated the system on data from three recorded meetings. Adding the acoustic information provided an 8% error reduction on average compared to using a single modality.
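
The abstract says that the audio- and video-based predictions are made separately and then combined, but it does not say how. As an illustration only, the following sketch assumes that each predictor outputs a probability for every possible focus target and that the two outputs are merged by a weighted linear combination. The function names, the `audio_weight` parameter, and the example probabilities are hypothetical and are not taken from the paper.

```python
from typing import Dict


def fuse_focus_predictions(
    video_probs: Dict[str, float],
    audio_probs: Dict[str, float],
    audio_weight: float = 0.5,
) -> Dict[str, float]:
    """Combine two per-target probability distributions.

    Assumption: a weighted linear combination. The abstract does not
    specify the combination rule, so this is only one plausible choice.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if video_probs.keys() != audio_probs.keys():
        raise ValueError("both predictors must score the same targets")

    # Weighted sum of the two estimates for each candidate target.
    fused = {
        target: (1.0 - audio_weight) * video_probs[target]
        + audio_weight * audio_probs[target]
        for target in video_probs
    }

    # Renormalize so the result is again a probability distribution.
    total = sum(fused.values())
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    return {target: score / total for target, score in fused.items()}


def most_likely_target(probs: Dict[str, float]) -> str:
    """Return the focus target with the highest probability."""
    return max(probs, key=probs.get)


# Hypothetical outputs for one participant looking at one of three others.
video = {"A": 0.5, "B": 0.3, "C": 0.2}  # head-pose based estimate
audio = {"A": 0.2, "B": 0.7, "C": 0.1}  # estimate based on who is speaking

fused = fuse_focus_predictions(video, audio, audio_weight=0.5)
focus = most_likely_target(fused)
```

In this made-up example the video-only estimate favors A, while the fused estimate favors B, the target that the audio cue supports. In practice the weight would have to be tuned on held-out data.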