Understanding Human Behaviors Based on Eye-Head-Hand Coordination

Authors:
Chen Yu; Dana H. Ballard
Venue:
BMCV '02 Proceedings of the Second International Workshop on Biologically Motivated Computer Vision
Year:
2002

Abstract

Action recognition has traditionally focused on processing fixed-camera observations while ignoring non-visual information. In this paper, we explore the dynamic properties of the movements of different body parts in natural tasks, where eye, head and hand movements are tightly coupled with the ongoing task. In light of this, our method takes an agent-centered view and incorporates an extensive description of eye-head-hand coordination. By tracking the course of gaze and head movements, our approach uses gaze and head cues to detect agent-centered attention switches, which are then used to segment an action sequence into action units. Once those action primitives are recognized, parallel hidden Markov models are applied to model and integrate the probabilistic sequences of the action units of different body parts. An experimental system is built for recognizing human behaviors in three natural tasks: "unscrewing a jar", "stapling a letter" and "pouring water". The system demonstrates the effectiveness of the approach.
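The two-stage pipeline described in the abstract (segment on attention switches, then score the per-body-part unit sequences with parallel HMMs) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the gaze-speed threshold used as the attention-switch cue, the discrete symbol encoding of action units, the function names, and the choice to combine streams by summing their log-likelihoods (which treats the streams as independent given the task).

```python
import math


def segment_by_attention_switch(gaze, speed_threshold):
    """Split a gaze trace into units at large sample-to-sample jumps.

    Assumption: a jump larger than speed_threshold stands in for an
    attention switch. gaze is a list of (x, y) positions sampled at a
    fixed rate. Returns (start, end) index pairs, end exclusive.
    """
    boundaries = [0]
    for i in range(1, len(gaze)):
        dx = gaze[i][0] - gaze[i - 1][0]
        dy = gaze[i][1] - gaze[i - 1][1]
        if math.hypot(dx, dy) > speed_threshold:
            boundaries.append(i)
    boundaries.append(len(gaze))
    return [(s, e) for s, e in zip(boundaries, boundaries[1:]) if e > s]


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_log_likelihood(obs, log_start, log_trans, log_emit):
    """Forward algorithm for a discrete-observation HMM, in log space.

    obs is a non-empty list of symbol indices; log_trans[p][s] is the
    log-probability of moving from state p to state s; log_emit[s][o]
    is the log-probability of state s emitting symbol o.
    """
    if not obs:
        raise ValueError("obs must contain at least one symbol")
    n = len(log_start)
    alpha = [log_start[s] + log_emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + log_trans[p][s] for p in range(n)])
            + log_emit[s][o]
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(streams, models):
    """Pick the task whose per-stream HMMs best explain all streams.

    streams: stream name -> list of symbol indices (one per action unit).
    models: task name -> {stream name -> (log_start, log_trans, log_emit)}.
    Assumption: streams are scored independently and their
    log-likelihoods are summed.
    """
    scores = {
        task: sum(
            hmm_log_likelihood(streams[name], *params)
            for name, params in per_stream.items()
        )
        for task, per_stream in models.items()
    }
    return max(scores, key=scores.get), scores
```

In use, each body part (for example gaze, head and hand) would contribute one symbol sequence per recording, and each candidate task would hold one trained HMM per stream. Note that a saccade spanning several consecutive samples produces one-sample units in this sketch; a real system would likely merge or discard them.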