We present a probabilistic generative model for simultaneously recognizing daily actions and predicting gaze locations in videos recorded from an egocentric camera. We focus on activities requiring eye-hand coordination and model the spatio-temporal relationship between the gaze point, the scene objects, and the action label. Our model captures the fact that the distribution of both visual features and object occurrences in the vicinity of the gaze point is correlated with the verb-object pair describing the action. It explicitly incorporates known properties of gaze behavior from the psychology literature, such as the temporal delay between fixation and manipulation events. We present an inference method that predicts the best sequence of gaze locations and the associated action label from an input sequence of images. We demonstrate improvements in action recognition rates and gaze prediction accuracy over state-of-the-art methods on two new datasets containing egocentric videos of daily activities with gaze measurements.
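The abstract describes joint inference over a gaze-location sequence and an action label. As an illustration only, and not the authors' actual model or inference procedure, the sketch below assumes gaze is discretized into grid cells, per-frame emission log-likelihoods conditioned on each candidate action are already computed, and gaze smoothness is modeled as a first-order Markov transition; a Viterbi pass is then run per action hypothesis and the best joint score is kept. All function names, the data layout, and the toy inputs are hypothetical.

    import numpy as np

    def viterbi_gaze(emission_ll, transition_ll):
        """Most likely gaze-cell path for one action hypothesis (hypothetical sketch).

        emission_ll:   (T, K) array, log p(features_t | gaze cell k, action)
        transition_ll: (K, K) array, log p(cell j at t | cell i at t-1)
        Returns (best path log-score, gaze path of length T).
        """
        T, K = emission_ll.shape
        score = emission_ll[0].copy()                 # best score ending in each cell at t=0
        backptr = np.zeros((T, K), dtype=int)
        for t in range(1, T):
            cand = score[:, None] + transition_ll     # cand[i, j]: come from cell i into cell j
            backptr[t] = cand.argmax(axis=0)
            score = cand.max(axis=0) + emission_ll[t]
        path = [int(score.argmax())]
        for t in range(T - 1, 0, -1):                 # trace back the best path
            path.append(int(backptr[t, path[-1]]))
        return float(score.max()), path[::-1]

    def recognize(emissions_per_action, transition_ll, action_prior_ll):
        """Pick the action label and gaze sequence with the highest joint log-score."""
        best = None
        for action, emission_ll in emissions_per_action.items():
            ll, path = viterbi_gaze(emission_ll, transition_ll)
            total = ll + action_prior_ll[action]
            if best is None or total > best[0]:
                best = (total, action, path)
        return best  # (joint log-score, action label, gaze-cell path)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        T, K = 8, 16                                  # 8 frames, 4x4 grid of gaze cells
        trans = np.full((K, K), 1e-3) + np.eye(K)     # toy prior: gaze tends to stay put
        trans_ll = np.log(trans / trans.sum(axis=1, keepdims=True))
        emissions = {a: rng.normal(size=(T, K)) for a in ("take cup", "pour milk")}
        prior = {a: np.log(0.5) for a in emissions}
        score, action, gaze = recognize(emissions, trans_ll, prior)
        print(action, score, gaze)

In this toy setup the per-frame emission scores stand in for whatever appearance and object evidence surrounds each candidate gaze cell; the actual model couples these terms to the verb-object pair and to fixation-to-manipulation timing, which this sketch does not attempt to reproduce.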