View-invariant modeling and recognition of human actions using grammars

  • Authors:
  • Abhijit S. Ogale; Alap Karapurkar; Yiannis Aloimonos

  • Affiliation (all authors):
  • Computer Vision Laboratory, Dept. of Computer Science, University of Maryland, College Park, MD

  • Venue:
  • WDV'05/WDV'06/ICCV'05/ECCV'06: Proceedings of the 2005/2006 International Conference on Dynamical Vision
  • Year:
  • 2006


Abstract

In this paper, we represent human actions as sentences generated by a language built on atomic body poses, or phonemes. Knowledge of body pose is stored only implicitly, as a set of silhouettes seen from multiple viewpoints; no explicit 3D poses or body models are used, and individual body parts are not identified. Actions and their constituent atomic poses are extracted from a set of multi-view, multi-person video sequences by an automatic key frame selection process, and are used to automatically construct a probabilistic context-free grammar (PCFG) that encodes the syntax of the actions. Given a new single-viewpoint video, we can parse it to recognize actions and changes in viewpoint simultaneously. Experimental results are provided.
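The core idea — actions as productions over atomic pose "phonemes", recognized by picking the most probable derivation — can be illustrated with a minimal sketch. This is not the authors' implementation: the pose labels, rule probabilities, and the degenerate flat grammar (one production level, exact sequence match rather than a full chart parse) are all illustrative assumptions.

```python
# Toy sketch of PCFG-style action recognition (illustrative only, not
# the paper's system). Terminals are atomic pose labels ("phonemes")
# observed from a single viewpoint; each action expands to one or more
# pose sequences with an associated rule probability.
PCFG_RULES = {
    # action -> list of (probability, pose sequence) productions
    "walk": [(0.7, ["stand", "step_L", "step_R"]),
             (0.3, ["step_L", "step_R", "step_L"])],
    "sit":  [(1.0, ["stand", "bend", "seated"])],
}

def recognize(observed):
    """Return (best_action, probability) for an observed pose sequence,
    by exact match against each production (a degenerate parse)."""
    best_action, best_prob = None, 0.0
    for action, productions in PCFG_RULES.items():
        for prob, poses in productions:
            if poses == observed and prob > best_prob:
                best_action, best_prob = action, prob
    return best_action, best_prob

print(recognize(["stand", "step_L", "step_R"]))  # ('walk', 0.7)
```

A real parser would additionally marginalize over viewpoint changes and use a chart algorithm (e.g. CYK) so that partial and nested derivations are scored, rather than requiring an exact full-sequence match.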