Understanding human behavior in video data is essential in numerous applications, including surveillance, video annotation and retrieval, and human-computer interfaces. This paper describes a framework for recognizing human actions and interactions in video using three levels of abstraction. At the low level, the poses of individual body parts, including the head, torso, arms, and legs, are recognized using individual Bayesian networks (BNs), which are then integrated to obtain an overall body pose. At the mid level, the actions of a single person are modeled using a dynamic Bayesian network (DBN) with temporal links between identical states of the Bayesian network at times t and t+1. At the high level, the mid-level descriptions for each person are juxtaposed along a common timeline to identify an interaction between two persons. The linguistic 'verb argument structure' is used to represent human action in terms of triplets. A decision tree applies spatial and temporal constraints to recognize specific interactions, yielding a meaningful semantic description in terms of subject-verb-object. Our method provides a user-friendly natural-language description of several human interactions and correctly describes positive, neutral, and negative interactions occurring between two persons. Example sequences of real persons are presented to illustrate the paradigm.
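The high-level step can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes each person's mid-level output is already available as a triplet with a frame interval, and it replaces the decision tree with two hand-written rules. The names `ActionTriplet`, `overlap`, and `classify_interaction`, the motion labels, the interaction labels, and the `near` distance threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionTriplet:
    """Hypothetical mid-level description: <agent, motion, target> over frames."""
    agent: str
    motion: str
    target: str
    start: int  # first frame, inclusive
    end: int    # last frame, inclusive


def overlap(a: ActionTriplet, b: ActionTriplet) -> int:
    """Number of frames shared by the two intervals (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def classify_interaction(a: ActionTriplet, b: ActionTriplet,
                         distance: float, near: float = 1.0) -> Optional[str]:
    """Toy decision tree over one spatial and one temporal constraint.

    `distance` is an assumed scalar separation between the two persons;
    `near` is an assumed threshold, not a value from the paper.
    """
    # Spatial constraint: contact interactions need the persons close together.
    if distance > near:
        return None
    # Only consider cases where person A reaches toward person B.
    if a.motion != "stretch_arm" or a.target != b.agent:
        return None
    # Temporal constraint: both reach toward each other at the same time.
    if b.motion == "stretch_arm" and b.target == a.agent and overlap(a, b) > 0:
        return "shake hands"
    # Temporal constraint: B moves back no earlier than A starts reaching.
    if b.motion == "move_backward" and b.start >= a.start:
        return "push"
    return None


def describe(a: ActionTriplet, b: ActionTriplet, label: str) -> str:
    """Render a subject-verb-object style sentence for a recognized label."""
    if label == "shake hands":
        return f"{a.agent} and {b.agent} shake hands"
    return f"{a.agent} {label}es {b.agent}" if label == "push" else f"{a.agent} {label} {b.agent}"
```

Juxtaposing two triplets on a common timeline then reduces to calling `classify_interaction` on each pair whose intervals are close in time. A learned decision tree would replace the fixed rules here, with the same spatial and temporal features as inputs.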