A human activity can be viewed as a space-time repetition of activity primitives. Both the instances of the primitives and their repetition are stochastic. They can be modeled by a generative model-graph, where nodes correspond to the primitives, and the graph's adjacency matrix encodes their affinities for probabilistic grouping into observable video features. When a video of the activity is represented by a graph capturing the space-time layout of video features, such a video graph can be viewed as probabilistically sampled from the activity's model-graph. This sampling is formulated as a successive Kronecker multiplication of the model's affinity matrix. The resulting Kronecker-power matrix is taken as a noisy permutation of the adjacency matrix of the video graph. The paper presents: 1) our model-graph; 2) a memory- and time-efficient, weakly supervised learning of activity primitives and their affinities; and 3) an inference aimed at finding the best expected correspondences between the primitives and observed video features. Our results demonstrate good scalability on UCF50, and superior performance to that of the state of the art on individual, structured, and collective activities of the UCF YouTube, Olympic, and Collective datasets.
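To illustrate the generative step described above, the following is a minimal sketch of sampling a graph from successive Kronecker multiplications of a small affinity matrix, in the spirit of stochastic Kronecker graph models. The initiator matrix `theta`, its size, the power `k`, and the Bernoulli edge-sampling step are all illustrative assumptions, not the paper's actual learning or inference procedure.

```python
import numpy as np

def kronecker_power(theta, k):
    """Return the k-th Kronecker power of the initiator affinity matrix theta."""
    p = theta
    for _ in range(k - 1):
        p = np.kron(p, theta)
    return p

def sample_video_graph(theta, k, seed=None):
    """Sample a binary adjacency matrix by treating each entry of the
    Kronecker-power matrix as an independent Bernoulli edge probability."""
    rng = np.random.default_rng(seed)
    p = kronecker_power(theta, k)
    return (rng.random(p.shape) < p).astype(int)

# Hypothetical 2x2 initiator: affinities between two activity primitives,
# entries in [0, 1] so they can act as edge probabilities.
theta = np.array([[0.9, 0.5],
                  [0.5, 0.3]])
adj = sample_video_graph(theta, 3, seed=0)
print(adj.shape)  # (8, 8): the 3rd Kronecker power of a 2x2 matrix
```

Each Kronecker multiplication expands the graph by a factor equal to the initiator's size, so a small set of primitives and their affinities can generate video graphs of varying size; in the paper, the observed video graph is matched to such a power matrix up to a noisy permutation.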