View-invariant action recognition using interest points
MIR '08 Proceedings of the 1st ACM international conference on Multimedia information retrieval
A survey on vision-based human action recognition
Image and Vision Computing
Action recognition based on learnt motion semantic vocabulary
PCM'10 Proceedings of the 11th Pacific Rim conference on Advances in multimedia information processing: Part I
Representing pairwise spatial and temporal relations for action recognition
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
Discovering motion patterns for human action recognition
PCM'10 Proceedings of the 11th Pacific Rim conference on Advances in multimedia information processing: Part II
Human activity analysis: A review
ACM Computing Surveys (CSUR)
Unsupervised discovery of activity correlations using latent topic models
Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing
A survey of vision-based methods for action representation, segmentation and recognition
Computer Vision and Image Understanding
Event detection and recognition for semantic annotation of video
Multimedia Tools and Applications
Unsupervised action classification using space-time link analysis
Journal on Image and Video Processing
Online learning for PLSA-based visual recognition
ACCV'10 Proceedings of the 10th Asian conference on Computer vision - Volume Part II
Semantics extraction from images
Knowledge-driven multimedia information extraction and ontology evolution
Bag of spatio-temporal synonym sets for human action recognition
MMM'10 Proceedings of the 16th international conference on Advances in Multimedia Modeling
Representing feature quantization approach using spatial-temporal relation for action recognition
PerMIn'12 Proceedings of the First Indo-Japan conference on Perception and Machine Intelligence
Spatio-temporal phrases for activity recognition
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
Visual code-sentences: a new video representation based on image descriptor sequences
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part I
A unified framework for multi-target tracking and collective activity recognition
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part IV
Real-time exact graph matching with application in human action recognition
HBU'12 Proceedings of the Third international conference on Human Behavior Understanding
Action disambiguation analysis using normalized Google-like distance correlogram
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part III
Modeling multi-object interactions using "string of feature graphs"
Computer Vision and Image Understanding
Editor's Choice Article: Human activity recognition in videos using a single example
Image and Vision Computing
Advanced Engineering Informatics
Spatial-temporal local motion features have shown promising results in complex human action classification. Most previous works [6],[16],[21] treat these spatial-temporal features as a bag of video words, discarding all long-range, global information in both the spatial and temporal domains. Other approaches to learning the temporal signature of motion tend to impose a fixed trajectory on the features, or on parts of the human body returned by tracking algorithms, leaving the algorithm little flexibility to learn the optimal temporal pattern describing these motions. In this paper, we propose using spatial-temporal correlograms to encode flexible long-range temporal information into the spatial-temporal motion features, resulting in a much richer description of human actions. We then apply an unsupervised generative model to learn different classes of human actions from these ST-correlograms. The KTH dataset, one of the most popular and challenging human action datasets, is used for experimental evaluation. Our algorithm achieves the highest classification accuracy reported for this dataset under an unsupervised learning scheme.
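The abstract does not give implementation details, but the core idea of an ST-correlogram — counting co-occurrences of quantized video words at quantized spatio-temporal offsets, rather than a flat bag-of-words histogram — can be sketched as follows. This is a minimal illustrative version restricted to temporal offsets only; the function name `temporal_correlogram`, the bin scheme, and the normalization are assumptions, not the authors' actual method.

```python
import numpy as np

def temporal_correlogram(words, times, n_words, t_bins):
    """Simplified ST-correlogram (temporal part only): a histogram over
    ordered pairs of video words and quantized temporal offsets.

    words  : array of visual-word labels, one per interest point
    times  : array of frame times, one per interest point
    n_words: vocabulary size
    t_bins : bin edges for the absolute temporal offset |t_i - t_j|
    """
    words = np.asarray(words)
    times = np.asarray(times, dtype=float)
    hist = np.zeros((n_words, n_words, len(t_bins) - 1))
    for i in range(len(words)):
        for j in range(len(words)):
            if i == j:
                continue
            dt = abs(times[i] - times[j])
            # Locate the offset bin; pairs outside the bin range are ignored.
            b = np.searchsorted(t_bins, dt, side="right") - 1
            if 0 <= b < len(t_bins) - 1:
                hist[words[i], words[j], b] += 1
    total = hist.sum()
    # L1-normalize so videos of different lengths are comparable.
    return hist / total if total > 0 else hist
```

Flattening such a correlogram gives a fixed-length descriptor per video that still encodes long-range temporal structure, which could then be fed to an unsupervised generative model (e.g. a pLSA-style topic model, as in [21]) in place of a plain word histogram.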