This article presents a multimodal approach to head pose estimation for individuals in environments equipped with multiple cameras and microphones, such as SmartRooms or automatic video conferencing systems. Determining an individual's head orientation is the basis for many forms of more sophisticated interaction between humans and technical devices, and it can also be used for automatic sensor selection (camera, microphone) in communication or video surveillance systems. We propose particle filters as a unified framework for estimating head orientation in both the monomodal and multimodal cases. In video, head orientation is estimated from color information by exploiting spatial redundancy among cameras. Audio information is processed to estimate the direction of the voice produced by a speaker, making use of the directivity characteristics of the head radiation pattern. Furthermore, two particle filter schemes for fusing the audio and video streams are analyzed in terms of accuracy and robustness. In the first, fusion is performed at the decision level by combining the monomodal head pose estimates; the second uses a joint estimation system that combines information at the data level. Experimental results on the CLEAR 2006 evaluation database are reported, and a comparison of the proposed multimodal head pose estimation algorithms with the reference monomodal approaches shows the effectiveness of the proposed approach.
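The difference between the two fusion schemes can be illustrated with a minimal sketch. It is not the article's implementation: it assumes a single pan angle as the state, random-walk dynamics, and a von Mises-style likelihood standing in for the color-based and head-radiation-based observation models, which the abstract does not specify. All function names and the concentration values kappa_v and kappa_a are hypothetical. Decision-level fusion runs one filter per modality and merges the two estimates; data-level fusion runs a single filter whose particle weights are the product of both likelihoods.

```python
import math
import random


def wrap(angle):
    """Wrap an angle in radians to [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def likelihood(angle, observed, kappa):
    """Unnormalised von Mises-style likelihood (assumed observation model)."""
    return math.exp(kappa * (math.cos(angle - observed) - 1.0))


def circular_mean(angles, weights):
    """Weighted mean direction of a set of angles."""
    s = sum(w * math.sin(a) for a, w in zip(angles, weights))
    c = sum(w * math.cos(a) for a, w in zip(angles, weights))
    return math.atan2(s, c)


def pf_step(particles, weight_fn, rng, drift_sigma=0.1):
    """One predict / weight / estimate / resample cycle."""
    # Predict: random-walk dynamics on the angle (an assumption).
    predicted = [wrap(p + rng.gauss(0.0, drift_sigma)) for p in particles]
    # Weight each particle by the observation likelihood.
    weights = [weight_fn(p) for p in predicted]
    total = sum(weights)
    if total <= 0.0:  # degenerate case: fall back to uniform weights
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    weights = [w / total for w in weights]
    estimate = circular_mean(predicted, weights)
    # Resample with replacement in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def run_filter(weight_fns, n_particles=500, seed=0):
    """Run a filter over per-frame weight functions; return the last estimate."""
    rng = random.Random(seed)
    particles = [rng.uniform(-math.pi, math.pi) for _ in range(n_particles)]
    estimate = 0.0
    for weight_fn in weight_fns:
        particles, estimate = pf_step(particles, weight_fn, rng)
    return estimate


def decision_level(video_obs, audio_obs, kappa_v=8.0, kappa_a=4.0):
    """Two monomodal filters; their estimates are merged afterwards."""
    v = run_filter([lambda a, o=o: likelihood(a, o, kappa_v) for o in video_obs], seed=1)
    u = run_filter([lambda a, o=o: likelihood(a, o, kappa_a) for o in audio_obs], seed=2)
    return circular_mean([v, u], [kappa_v, kappa_a])


def data_level(video_obs, audio_obs, kappa_v=8.0, kappa_a=4.0):
    """One joint filter; each particle weight is the product of both likelihoods."""
    fns = [
        lambda a, v=v, u=u: likelihood(a, v, kappa_v) * likelihood(a, u, kappa_a)
        for v, u in zip(video_obs, audio_obs)
    ]
    return run_filter(fns, seed=3)
```

Calling decision_level and data_level with the same per-frame angle observations (in radians) returns one fused pan estimate each. In this sketch the merge weights in decision_level simply reuse the assumed concentrations; how the monomodal estimates are actually combined in the article is not stated in the abstract.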