Estimating the lecturer's head pose in seminar scenarios – a multi-view approach
MLMI'05 Proceedings of the Second international conference on Machine Learning for Multimodal Interaction
In the context of human-computer interaction, head pose is an important cue for inferring a person's focus of attention. In this paper, we present an approach to estimating the horizontal head rotation of people inside a smart room. The room is equipped with multiple cameras that aim to provide at least one facial view of the user at any location in the room. We use neural networks trained on samples of rotated heads to classify each camera view. Whenever more than one estimate of head rotation is available, we combine the individual estimates into one joint hypothesis. We show experimentally that, by using the proposed combination scheme, the mean error for unknown users can be reduced by up to 50% when combining the estimates from multiple cameras.
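The abstract does not spell out the combination scheme, so the following is only a minimal sketch of one plausible fusion step: given per-camera horizontal head-rotation estimates already transformed into a common world frame, a confidence-weighted circular mean yields a single joint hypothesis. The function name, the `(angle, confidence)` input format, and the weighting are assumptions for illustration, not the authors' actual method.

```python
import math

def fuse_head_rotation(estimates):
    """Fuse per-camera horizontal head-rotation estimates (degrees, world
    frame) into one joint hypothesis via a confidence-weighted circular mean.

    estimates: list of (angle_deg, confidence) pairs, one per camera view.
    Returns the fused angle in degrees in [0, 360), or None if empty.
    """
    if not estimates:
        return None
    # Average on the unit circle so that e.g. 359 deg and 1 deg fuse to
    # 0 deg, not to the meaningless linear mean of 180 deg.
    sx = sum(c * math.cos(math.radians(a)) for a, c in estimates)
    sy = sum(c * math.sin(math.radians(a)) for a, c in estimates)
    return math.degrees(math.atan2(sy, sx)) % 360.0

# Two confident cameras agree near 0 deg; a weak third view disagrees.
print(round(fuse_head_rotation([(358.0, 0.9), (2.0, 0.9), (90.0, 0.1)])))  # → 3
```

Working on the unit circle is the key design choice here: it handles the wrap-around at 0°/360° correctly, and the magnitude of the summed vector could additionally serve as a confidence measure for the fused hypothesis.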