This paper presents a study on recognizing the visual focus of attention (VFOA) of meeting participants from their head pose. Contrary to previous studies on the topic, in our set-up the potential VFOA of a person is not restricted to the other meeting participants but also includes environmental targets (the table, a slide screen). This has two consequences. First, it increases the number of possible ambiguities in identifying the VFOA from the head pose. Second, due to our particular set-up, identifying the VFOA from head pose cannot rely on an incomplete representation of the pose (the pan alone) but requires the full head pointing information (pan and tilt). Using a corpus of 8 meetings of 8 minutes on average, each featuring 4 people discussing statements projected on a slide screen, we analyze these issues by evaluating, through numerical performance measures, the recognition of the VFOA from head pose information obtained either from a magnetic sensor device (the ground truth) or from a vision-based tracking system (head pose estimates). The results clearly show that in complex but realistic situations it is quite optimistic to believe that VFOA recognition can be based solely on head pose, as some previous studies have suggested.
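To make the recognition problem concrete, the following is a minimal sketch of how a VFOA label might be inferred from a (pan, tilt) head-pose observation by scoring each candidate target with a Gaussian likelihood centered on a per-target pose prototype. The target names, prototype angles, and variances below are purely illustrative assumptions, not values from the paper; the actual study's set-up and models are not reproduced here.

```python
import math

# Hypothetical per-target head-pose prototypes (pan, tilt) in degrees.
# Illustrative values only; the paper's real meeting geometry is not given here.
TARGETS = {
    "person_left":  (-45.0, 0.0),
    "person_right": (45.0, 0.0),
    "slide_screen": (0.0, 10.0),
    "table":        (0.0, -30.0),
}

def classify_vfoa(pan, tilt, sigma_pan=15.0, sigma_tilt=10.0):
    """Return the most likely VFOA target for one (pan, tilt) observation.

    Each target is scored with an independent Gaussian log-likelihood on
    pan and tilt (constant terms dropped); the highest-scoring target wins.
    Using tilt as well as pan is what lets targets at similar pan angles
    (e.g. slide screen vs. table) be disambiguated.
    """
    best_name, best_ll = None, -math.inf
    for name, (pan0, tilt0) in TARGETS.items():
        ll = -((pan - pan0) / sigma_pan) ** 2 - ((tilt - tilt0) / sigma_tilt) ** 2
        if ll > best_ll:
            best_name, best_ll = name, ll
    return best_name
```

Note how a pan-only model would confuse the slide screen with the table (both near pan 0 in this toy geometry); only the tilt component separates them, which mirrors the paper's point that full pan-and-tilt information is needed once environmental targets are added.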