Temporal Classification of Natural Gesture and Application to Video Coding

Authors:
Andrew D. Wilson; Aaron F. Bobick; Justine Cassell
Venue:
CVPR '97 Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97)
Year:
1997

Cited 10

Parametric Hidden Markov Models for Gesture Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence

Recognition of Visual Activities and Interactions by Stochastic Parsing
IEEE Transactions on Pattern Analysis and Machine Intelligence

Design of a digital library for human movement
Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries

Hidden Markov models for modeling and recognizing gesture under variation
Hidden Markov models

Nonlinear PHMMs for the Interpretation of Parameterized Gesture
CVPR '98 Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Physically interactive story environments
IBM Systems Journal

Turning lectures into comic books using linguistically salient gestures
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1

Automatic temporal segment detection and affect recognition from face and body display
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics - Special issue on human computing

When Pyramids Learned Walking
CIARP '09 Proceedings of the 14th Iberoamerican Conference on Pattern Recognition: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications

Multi-layered hand and face tracking for real-time gesture recognition
ICONIP'08 Proceedings of the 15th international conference on Advances in neuro-information processing - Volume Part I

Abstract

We present a method for the temporal classification of natural gesture from video imagery. The work is motivated by recent developments in the theory of natural gesture, which have identified several temporal aspects of gesture that are important to communication. In particular, gesticulation during conversation can be coarsely characterized as periods of bi-phasic or tri-phasic gesture separated by a rest state. We first present an automatic procedure for hypothesizing plausible rest-state configurations of a speaker. Second, we develop a state-based parsing algorithm that both selects among the candidate rest states and parses an incoming video stream into bi-phasic and tri-phasic gestures. Finally, we demonstrate the use of the bi-phasic/tri-phasic labeling to select semantically significant static images for low-bandwidth coding of video of story-telling speakers.
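The idea of parsing a stream into gestures bounded by rest can be illustrated with a toy sketch. It is not the paper's algorithm. It assumes a hypothetical upstream step has already labeled each frame as "rest", "out", "stroke", or "back", and it assumes that a bi-phasic gesture is an out/back pair and a tri-phasic gesture is an out/stroke/back triple. The label names, the PATTERNS table, and the parse_gestures function are all invented for illustration.

```python
from itertools import groupby

REST = "rest"

# Assumed phase grammar (not taken from the paper): which sequence of
# movement phases between two rest periods counts as which gesture type.
PATTERNS = {
    ("out", "back"): "bi-phasic",
    ("out", "stroke", "back"): "tri-phasic",
}


def parse_gestures(frame_labels):
    """Return (start, end, kind) for every run of non-rest frames that is
    bounded by rest frames on both sides. `end` is exclusive."""
    # Collapse the per-frame labels into runs: (label, start, end).
    runs = []
    pos = 0
    for label, group in groupby(frame_labels):
        length = len(list(group))
        runs.append((label, pos, pos + length))
        pos += length

    gestures = []
    phases = []        # movement runs seen since the last rest period
    seen_rest = False  # ignore movement before the first rest period
    for label, start, end in runs:
        if label == REST:
            if seen_rest and phases:
                key = tuple(name for name, _, _ in phases)
                kind = PATTERNS.get(key, "unknown")
                gestures.append((phases[0][1], phases[-1][2], kind))
            phases = []
            seen_rest = True
        elif seen_rest:
            phases.append((label, start, end))
    # Movement not yet closed by a rest period is left unparsed.
    return gestures


labels = ["rest", "rest", "out", "out", "back", "rest",
          "out", "stroke", "stroke", "back", "back", "rest", "out"]
result = parse_gestures(labels)
```

Under these assumptions, the example stream yields one bi-phasic gesture over frames 2 to 4 and one tri-phasic gesture over frames 6 to 10; the trailing "out" frame is ignored because no rest period has closed it yet. A coding application along the lines the abstract describes could then pick key frames from the tri-phasic spans, although how frames are chosen is not specified here.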