Recognizing multitasked activities from video using stochastic context-free grammar

Authors:
Darnell Moore; Irfan Essa
Affiliations:
Texas Instruments, Video & Imaging Processing/DSP R&D Center, Dallas, TX; Georgia Institute of Technology, GVU Center/College of Computing, Atlanta, GA
Venue:
Eighteenth National Conference on Artificial Intelligence
Year:
2002

Abstract

In this paper, we present techniques for recognizing complex, multitasked activities from video. Visual information, such as image features and motion appearance, is first combined with domain-specific information, such as object context, to label events. Each action event is represented by a unique symbol, so that a sequence of interactions can be described as an ordered symbolic string. A stochastic context-free grammar (SCFG), developed from the underlying rules of an activity, then provides the structure for recognizing semantically meaningful behavior over extended periods. Symbolic strings are parsed using the Earley-Stolcke algorithm to determine the most likely semantic derivation for recognition. Parsing substrings allows us to recognize patterns that describe high-level, complex events taking place over segments of the video sequence. We introduce new parsing strategies that enable error detection and recovery in stochastic context-free grammars, as well as methods for quantifying group and individual behavior in activities with separable roles. Through experiments with a popular card game, we demonstrate the recognition of high-level narratives of multi-player games and the identification of player strategies and behavior using computer vision.
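
To make the idea of finding the most likely derivation of a symbolic event string concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract names the Earley-Stolcke algorithm, whereas this sketch swaps in the simpler probabilistic CYK (Viterbi) algorithm, which requires a grammar in Chomsky normal form. The grammar, the symbols (`b`, `d`, `p`), the rule probabilities, and the function name `viterbi_parse` are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy SCFG in Chomsky normal form. Symbols and probabilities are
# invented for illustration and are NOT taken from the paper.
# Terminals: 'b' = a bet is placed, 'd' = a card is dealt, 'p' = a payout.
BINARY = {  # (A, B, C): P(A -> B C)
    ("GAME", "BET", "PLAY"): 1.0,
    ("PLAY", "DEALS", "PAY"): 1.0,
    ("DEALS", "DEAL", "DEALS"): 0.4,
}
LEXICAL = {  # (A, terminal): P(A -> terminal)
    ("BET", "b"): 1.0,
    ("DEAL", "d"): 1.0,
    ("DEALS", "d"): 0.6,
    ("PAY", "p"): 1.0,
}


def viterbi_parse(tokens, start="GAME"):
    """Return (log_prob, tree) for the most likely derivation, or None."""
    n = len(tokens)
    # best[i, j][A] = (log_prob, tree) of the best derivation of tokens[i:j] from A
    best = defaultdict(dict)

    # Length-1 spans: apply lexical rules A -> terminal.
    for i, tok in enumerate(tokens):
        for (lhs, term), p in LEXICAL.items():
            if term == tok:
                lp = math.log(p)
                cell = best[i, i + 1]
                if lhs not in cell or lp > cell[lhs][0]:
                    cell[lhs] = (lp, (lhs, tok))

    # Longer spans: combine two adjacent sub-spans with binary rules A -> B C.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (lhs, left, right), p in BINARY.items():
                    if left in best[i, k] and right in best[k, j]:
                        lp = math.log(p) + best[i, k][left][0] + best[k, j][right][0]
                        cell = best[i, j]
                        if lhs not in cell or lp > cell[lhs][0]:
                            cell[lhs] = (lp, (lhs, best[i, k][left][1], best[k, j][right][1]))

    return best[0, n].get(start)


if __name__ == "__main__":
    result = viterbi_parse(["b", "d", "d", "p"])
    if result is not None:
        log_prob, tree = result
        print(math.exp(log_prob), tree)
    print(viterbi_parse(["d", "b", "p"]))  # string the toy grammar cannot derive
```

A string the grammar cannot derive yields `None`, which is the point at which an error detection and recovery strategy such as the one the abstract mentions would have to step in; this sketch does not attempt that.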