This paper presents a study on recognizing the visual focus of attention (VFOA) of meeting participants from their head pose. Contrary to previous studies on the topic, in our set-up the potential VFOA of a person is not restricted to the other meeting participants but also includes environmental targets (the table, a slide screen). This has two consequences. First, it increases the number of possible ambiguities in identifying the VFOA from the head pose. Second, due to our particular set-up, identifying the VFOA from head pose cannot rely on an incomplete representation of the pose (the pan alone) but requires the full head pointing information (pan and tilt). Using a corpus of 8 meetings of 8 minutes on average, each featuring 4 people discussing statements projected on a slide screen, we analyze these issues by evaluating, through numerical performance measures, the recognition of the VFOA from head pose information obtained either from a magnetic sensor device (the ground truth) or from a vision-based tracking system (head pose estimates). The results clearly show that in complex but realistic situations it is quite optimistic to believe that VFOA recognition can be based solely on head pose, as some previous studies have suggested.
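To make the recognition problem concrete, the following is a minimal sketch of how a VFOA label might be inferred from a (pan, tilt) head-pose observation by scoring each candidate target with a Gaussian likelihood centered on a per-target pose prototype. The target names, prototype angles, and variances below are purely illustrative assumptions, not values from the paper; the actual study's set-up and models are not reproduced here.

```python
import math

# Hypothetical per-target head-pose prototypes (pan, tilt) in degrees.
# Illustrative values only; the paper's real meeting geometry is not given here.
TARGETS = {
    "person_left":  (-45.0, 0.0),
    "person_right": (45.0, 0.0),
    "slide_screen": (0.0, 10.0),
    "table":        (0.0, -30.0),
}

def classify_vfoa(pan, tilt, sigma_pan=15.0, sigma_tilt=10.0):
    """Return the most likely VFOA target for one (pan, tilt) observation.

    Each target is scored with an independent Gaussian log-likelihood on
    pan and tilt (constant terms dropped); the highest-scoring target wins.
    Using tilt as well as pan is what lets targets at similar pan angles
    (e.g. slide screen vs. table) be disambiguated.
    """
    best_name, best_ll = None, -math.inf
    for name, (pan0, tilt0) in TARGETS.items():
        ll = -((pan - pan0) / sigma_pan) ** 2 - ((tilt - tilt0) / sigma_tilt) ** 2
        if ll > best_ll:
            best_name, best_ll = name, ll
    return best_name
```

Note how a pan-only model would confuse the slide screen with the table (both near pan 0 in this toy geometry); only the tilt component separates them, which mirrors the paper's point that full pan-and-tilt information is needed once environmental targets are added.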