Hierarchical multi-channel hidden semi Markov graphical models for activity recognition

Authors:
Pradeep Natarajan;Ramakant Nevatia
Affiliations:
Speech, Language and Multimedia Business Unit, Raytheon BBN Technologies, Cambridge, MA 02138, United States;Institute for Robotics and Intelligent Systems, University of Southern California, Los Angeles, CA 90089-0273, United States
Venue:
Computer Vision and Image Understanding
Year:
2013

Abstract

Recognizing human actions from a stream of unsegmented sensory observations is important for applications such as surveillance and human-computer interaction. A wide range of graphical models has been proposed for these tasks, typically as extensions of generative hidden Markov models (HMMs) or their discriminative counterpart, conditional random fields (CRFs). These extensions usually address one of three key limitations of the basic HMM/CRF formalism: unrealistic models for the duration of a sub-event, no direct encoding of interactions among multiple agents, and no modeling of the inherent hierarchical organization of activities. In this work, we present a family of graphical models that generalizes such extensions and simultaneously models event duration, multi-agent interactions, and hierarchical structure. We also present general algorithms for efficient learning and inference in such models, based on local variational approximations. We demonstrate the effectiveness of our framework by developing graphical models for automatic American Sign Language (ASL) recognition and for gesture and action recognition in videos. Our methods achieve results comparable to the state of the art on the datasets we consider, while requiring far fewer training examples than methods based on low-level features.
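
The first limitation named in the abstract, unrealistic duration models, can be illustrated with a small sketch. In a standard HMM, the time spent in a state is governed only by its self-transition probability, so the implied duration distribution is geometric and its most likely duration is always a single time step. A semi-Markov model instead attaches an explicit duration distribution to each state. The code below is a minimal illustration under stated assumptions, not the models or algorithms from the paper: the function names, the self-transition value 0.9, and the truncated Gaussian duration model are all hypothetical choices made for this example.

```python
import math


def hmm_duration_pmf(self_transition_prob, d):
    """Probability that an HMM state lasts exactly d steps.

    The state is kept for d - 1 self-transitions and then left once,
    which gives a geometric distribution over d = 1, 2, ...
    """
    return (self_transition_prob ** (d - 1)) * (1.0 - self_transition_prob)


def explicit_duration_pmf(mean, std, max_d):
    """Discretized, truncated Gaussian over durations 1..max_d.

    This is only an illustrative explicit duration model; any
    distribution over positive integers could be used instead.
    """
    weights = [math.exp(-0.5 * ((d - mean) / std) ** 2)
               for d in range(1, max_d + 1)]
    total = sum(weights)
    return [w / total for w in weights]


MAX_D = 40
a = 0.9  # assumed self-transition probability; mean duration is 1 / (1 - a) = 10

geometric = [hmm_duration_pmf(a, d) for d in range(1, MAX_D + 1)]
explicit = explicit_duration_pmf(mean=10.0, std=2.0, max_d=MAX_D)

# Most likely duration under each model (durations are 1-indexed).
geometric_mode = 1 + max(range(MAX_D), key=lambda i: geometric[i])
explicit_mode = 1 + max(range(MAX_D), key=lambda i: explicit[i])

print("HMM (geometric) most likely duration:", geometric_mode)
print("Explicit-duration most likely duration:", explicit_mode)
```

Both models here have a mean duration near 10 steps, yet the geometric model assigns its highest probability to a duration of 1, while the explicit model peaks at 10. That mismatch is the kind of duration-modeling problem that semi-Markov extensions are meant to address.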