Estimating the lecturer's head pose in seminar scenarios – a multi-view approach
MLMI'05 Proceedings of the Second international conference on Machine Learning for Multimodal Interaction
In the context of human-computer interaction, head pose is an important cue for inferring a person's focus of attention. In this paper, we present an approach to estimating the horizontal head rotation of people inside a smart room. The room is equipped with multiple cameras that aim to provide at least one facial view of the user at any location in the room. We use neural networks trained on samples of rotated heads to classify each camera view. Whenever more than one estimate of head rotation is available, we combine the individual estimates into one joint hypothesis. We show experimentally that, by using the proposed combination scheme, the mean error for unknown users can be reduced by up to 50% when combining the estimates from multiple cameras.
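The abstract does not spell out the combination scheme, so the following is only a minimal sketch of one plausible fusion step: given per-camera horizontal head-rotation estimates already transformed into a common world frame, a confidence-weighted circular mean yields a single joint hypothesis. The function name, the `(angle, confidence)` input format, and the weighting are assumptions for illustration, not the authors' actual method.

```python
import math

def fuse_head_rotation(estimates):
    """Fuse per-camera horizontal head-rotation estimates (degrees, world
    frame) into one joint hypothesis via a confidence-weighted circular mean.

    estimates: list of (angle_deg, confidence) pairs, one per camera view.
    Returns the fused angle in degrees in [0, 360), or None if empty.
    """
    if not estimates:
        return None
    # Average on the unit circle so that e.g. 359 deg and 1 deg fuse to
    # 0 deg, not to the meaningless linear mean of 180 deg.
    sx = sum(c * math.cos(math.radians(a)) for a, c in estimates)
    sy = sum(c * math.sin(math.radians(a)) for a, c in estimates)
    return math.degrees(math.atan2(sy, sx)) % 360.0

# Two confident cameras agree near 0 deg; a weak third view disagrees.
print(round(fuse_head_rotation([(358.0, 0.9), (2.0, 0.9), (90.0, 0.1)])))  # → 3
```

Working on the unit circle is the key design choice here: it handles the wrap-around at 0°/360° correctly, and the magnitude of the summed vector could additionally serve as a confidence measure for the fused hypothesis.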