Recognizing Interactive Group Activities Using Temporal Interaction Matrices and Their Riemannian Statistics

  • Authors:
  • Ruonan Li
  • Rama Chellappa
  • Shaohua Kevin Zhou

  • Affiliations:
  • Harvard School of Engineering and Applied Sciences, Cambridge, USA 02138
  • Center for Automation Research, UMIACS, and the Department of Electrical and Computer Engineering, University of Maryland, College Park, USA 20742
  • Corporate Research & Technology, Siemens Corporation, Princeton, USA 08540

  • Venue:
  • International Journal of Computer Vision
  • Year:
  • 2013

Abstract

While video-based activity analysis and recognition have received much attention, a large body of existing work deals with activities of a single subject. The main objective of this paper is the modeling and recognition of coordinated multi-subject activities, or group activities, which arise in applications such as surveillance, sports, and biological monitoring. Unlike earlier attempts, which model the complex spatio-temporal constraints among multiple subjects with a parametric Bayesian network, we propose a compact and discriminative descriptor, referred to as the Temporal Interaction Matrix, for representing a coordinated group motion pattern. Moreover, we characterize the space of Temporal Interaction Matrices as the Discriminative Temporal Interaction Manifold (DTIM), and use it as a framework within which we develop a data-driven strategy for characterizing group motion patterns without employing domain-specific knowledge. In particular, we establish probability densities on the DTIM that compactly describe the statistical properties of the coordination and interactions among the multiple subjects in a group activity. For each class of group activity, we learn a multi-modal density function on the DTIM; a Maximum a Posteriori (MAP) classifier on the manifold is then designed for recognizing new activities. In addition, we extend the model so that participants can be explicitly distinguished from non-participants. We demonstrate how the framework applies to motions represented by point trajectories as well as to articulated human actions represented by images, and experiments in both settings show the effectiveness of the proposed approach.
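
To make the recognition step concrete, the following is a minimal sketch of MAP classification with per-class multi-modal densities, under stated assumptions: the density is approximated as a kernel mixture centered on training exemplars, and a plain Frobenius distance stands in for the paper's actual Riemannian geodesic on the DTIM. The function names (`geodesic_dist`, `class_density`, `map_classify`), the bandwidth, and the toy activity classes are all illustrative, not taken from the paper.

```python
# Hypothetical sketch of MAP recognition over Temporal Interaction Matrices.
# The metric and density form below are placeholder assumptions, not the
# paper's definitions.
import numpy as np

def geodesic_dist(A, B):
    """Placeholder metric between two Temporal Interaction Matrices.

    The paper endows the DTIM with a Riemannian structure; the Frobenius
    distance is substituted here only to keep the sketch self-contained."""
    return np.linalg.norm(A - B)

def class_density(X, exemplars, bandwidth=1.0):
    """Multi-modal (kernel) density estimate, one mode per training exemplar."""
    d2 = np.array([geodesic_dist(X, E) ** 2 for E in exemplars])
    return np.mean(np.exp(-d2 / (2.0 * bandwidth ** 2)))

def map_classify(X, class_exemplars, priors):
    """MAP rule: argmax over classes c of p(X | c) * p(c)."""
    scores = {c: class_density(X, Es) * priors[c]
              for c, Es in class_exemplars.items()}
    return max(scores, key=scores.get)

# Toy usage: two made-up activity classes, each with a few training matrices.
rng = np.random.default_rng(0)
train = {"handoff": [rng.normal(0, 1, (4, 4)) for _ in range(5)],
         "chase":   [rng.normal(2, 1, (4, 4)) for _ in range(5)]}
priors = {"handoff": 0.5, "chase": 0.5}
query = rng.normal(2, 1, (4, 4))
print(map_classify(query, train, priors))  # -> "chase"
```

Replacing `geodesic_dist` with a true manifold geodesic and fitting the mixture by learning (rather than placing one mode per exemplar) would bring the sketch closer to the density estimation the abstract describes.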