View-invariant action recognition using interest points
MIR '08 Proceedings of the 1st ACM international conference on Multimedia information retrieval
A survey on vision-based human action recognition
Image and Vision Computing
Action recognition based on learnt motion semantic vocabulary
PCM'10 Proceedings of the 11th Pacific Rim conference on Advances in multimedia information processing: Part I
Representing pairwise spatial and temporal relations for action recognition
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
Discovering motion patterns for human action recognition
PCM'10 Proceedings of the 11th Pacific Rim conference on Advances in multimedia information processing: Part II
Human activity analysis: A review
ACM Computing Surveys (CSUR)
Unsupervised discovery of activity correlations using latent topic models
Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing
A survey of vision-based methods for action representation, segmentation and recognition
Computer Vision and Image Understanding
Event detection and recognition for semantic annotation of video
Multimedia Tools and Applications
Unsupervised action classification using space-time link analysis
Journal on Image and Video Processing
Online learning for PLSA-based visual recognition
ACCV'10 Proceedings of the 10th Asian conference on Computer vision - Volume Part II
Semantics extraction from images
Knowledge-driven multimedia information extraction and ontology evolution
Bag of spatio-temporal synonym sets for human action recognition
MMM'10 Proceedings of the 16th international conference on Advances in Multimedia Modeling
Representing feature quantization approach using spatial-temporal relation for action recognition
PerMIn'12 Proceedings of the First Indo-Japan conference on Perception and Machine Intelligence
Spatio-temporal phrases for activity recognition
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
Visual code-sentences: a new video representation based on image descriptor sequences
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part I
A unified framework for multi-target tracking and collective activity recognition
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part IV
Real-time exact graph matching with application in human action recognition
HBU'12 Proceedings of the Third international conference on Human Behavior Understanding
Action disambiguation analysis using normalized Google-like distance correlogram
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part III
Modeling multi-object interactions using "string of feature graphs"
Computer Vision and Image Understanding
Editor's Choice Article: Human activity recognition in videos using a single example
Image and Vision Computing
Advanced Engineering Informatics
Spatial-temporal local motion features have shown promising results in complex human action classification. Most previous works [6],[16],[21] treat these spatial-temporal features as a bag of video words, discarding all long-range, global information in both the spatial and temporal domains. Other approaches to learning the temporal signature of motion tend to impose a fixed trajectory on the features, or on parts of the human body returned by tracking algorithms, leaving the algorithm little flexibility to learn the optimal temporal pattern describing these motions. In this paper, we propose using spatial-temporal correlograms to encode flexible long-range temporal information into the spatial-temporal motion features, resulting in a much richer description of human actions. We then apply an unsupervised generative model to learn different classes of human actions from these ST-correlograms. The KTH dataset, one of the most popular and challenging human action datasets, is used for experimental evaluation. Our algorithm achieves the highest classification accuracy reported for this dataset under an unsupervised learning scheme.
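The abstract does not give implementation details, but the core idea of an ST-correlogram — counting co-occurrences of quantized video words at quantized spatio-temporal offsets, rather than a flat bag-of-words histogram — can be sketched as follows. This is a minimal illustrative version restricted to temporal offsets only; the function name `temporal_correlogram`, the bin scheme, and the normalization are assumptions, not the authors' actual method.

```python
import numpy as np

def temporal_correlogram(words, times, n_words, t_bins):
    """Simplified ST-correlogram (temporal part only): a histogram over
    ordered pairs of video words and quantized temporal offsets.

    words  : array of visual-word labels, one per interest point
    times  : array of frame times, one per interest point
    n_words: vocabulary size
    t_bins : bin edges for the absolute temporal offset |t_i - t_j|
    """
    words = np.asarray(words)
    times = np.asarray(times, dtype=float)
    hist = np.zeros((n_words, n_words, len(t_bins) - 1))
    for i in range(len(words)):
        for j in range(len(words)):
            if i == j:
                continue
            dt = abs(times[i] - times[j])
            # Locate the offset bin; pairs outside the bin range are ignored.
            b = np.searchsorted(t_bins, dt, side="right") - 1
            if 0 <= b < len(t_bins) - 1:
                hist[words[i], words[j], b] += 1
    total = hist.sum()
    # L1-normalize so videos of different lengths are comparable.
    return hist / total if total > 0 else hist
```

Flattening such a correlogram gives a fixed-length descriptor per video that still encodes long-range temporal structure, which could then be fed to an unsupervised generative model (e.g. a pLSA-style topic model, as in [21]) in place of a plain word histogram.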