The central tenet of this paper is that determining where people are looking simplifies other tasks involved in understanding and interrogating a scene. To this end, we describe a fully automatic method for determining a person's attention, based on real-time visual tracking of their head and a coarse classification of their head pose. We estimate the head pose, or coarse gaze, using randomised ferns whose decision branches are based on both histograms of gradient orientations and colour-based features. We demonstrate the value of the coarse gaze in three applications: (i) by building static and temporally varying maps of the areas where people look, we identify interesting regions; (ii) by determining the gaze of people in the scene, we more effectively control a multi-camera surveillance system to acquire faces for identification; (iii) by identifying where people are looking, we more effectively classify human interactions.
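To make the classification step concrete, the following is a minimal sketch of a generic random-fern classifier, not the authors' implementation. It assumes each head image has already been reduced to a fixed-length feature vector (for instance, gradient-orientation histogram bins concatenated with colour features) and that the class labels stand in for coarse head-pose bins. The names `FernClassifier` and `toy_samples`, the pairwise-comparison tests, the Laplace smoothing, and all parameter values are illustrative assumptions; the abstract does not specify them.

```python
import math
import random


class FernClassifier:
    """Semi-naive Bayes over random ferns (illustrative sketch only)."""

    def __init__(self, n_features, n_classes, n_ferns=20, depth=4, seed=0):
        rng = random.Random(seed)
        self.n_classes = n_classes
        # Each fern is a fixed list of `depth` binary tests; each test
        # compares two randomly chosen feature dimensions (an assumption).
        self.ferns = [
            [tuple(rng.sample(range(n_features), 2)) for _ in range(depth)]
            for _ in range(n_ferns)
        ]
        # counts[fern][class][leaf] starts at 1 (Laplace smoothing), so no
        # leaf ever receives zero probability.
        self.counts = [
            [[1] * (2 ** depth) for _ in range(n_classes)]
            for _ in range(n_ferns)
        ]

    def _leaf(self, fern, x):
        # Concatenate the binary test outcomes into a leaf index.
        index = 0
        for i, j in fern:
            index = (index << 1) | (1 if x[i] > x[j] else 0)
        return index

    def fit(self, samples, labels):
        # Count how often each class lands in each leaf of each fern.
        for x, y in zip(samples, labels):
            for f, fern in enumerate(self.ferns):
                self.counts[f][y][self._leaf(fern, x)] += 1

    def log_likelihoods(self, x):
        # Ferns are treated as independent, so per-fern log-likelihoods add.
        scores = [0.0] * self.n_classes
        for f, fern in enumerate(self.ferns):
            leaf = self._leaf(fern, x)
            for c in range(self.n_classes):
                row = self.counts[f][c]
                scores[c] += math.log(row[leaf] / sum(row))
        return scores

    def predict(self, x):
        # Uniform class prior assumed: pick the most likely class.
        scores = self.log_likelihoods(x)
        return max(range(self.n_classes), key=scores.__getitem__)


def toy_samples(n_per_class, seed=1):
    """Synthetic stand-in for real descriptors (hypothetical data)."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label, base in enumerate(([1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1])):
        for _ in range(n_per_class):
            samples.append([v + rng.uniform(-0.1, 0.1) for v in base])
            labels.append(label)
    return samples, labels


samples, labels = toy_samples(100)
clf = FernClassifier(n_features=6, n_classes=2)
clf.fit(samples, labels)
```

In a real pipeline, the toy vectors would be replaced by descriptors computed from tracked head regions, and the two toy classes by however many coarse gaze directions are being distinguished. Keeping each fern small and combining many of them keeps the per-leaf tables cheap to store and evaluate, which is one reason ferns suit real-time use.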