Integrating local action elements for action analysis

  • Authors:
  • Tuan Hue Thi; Li Cheng; Jian Zhang; Li Wang; Shinichi Satoh

  • Affiliations:
  • National ICT Australia, Australia and School of Computer Science, University of New South Wales, Australia; Bioinformatics Institute, A*STAR, Singapore; National ICT Australia, Australia and School of Computer Science, University of New South Wales, Australia; Information Science and Technology Institute, Nanjing Forestry University, China; Multimedia Information Research Division, National Institute of Informatics, Japan

  • Venue:
  • Computer Vision and Image Understanding

  • Year:
  • 2012

Abstract

In this paper, we propose a framework for human action analysis from video footage. In our perspective, a video action sequence is a dynamic structure of sparse local spatial-temporal patches termed action elements, so action analysis is carried out based on both the local characteristics and the global shape of a prescribed action. We first detect a set of action elements, the most compact entities of an action, and then extend the idea of the Implicit Shape Model to space-time in order to properly integrate the spatial and temporal properties of these action elements. In particular, we consider two different recipes for constructing action elements: the first uses a Sparse Bayesian Feature Classifier to choose action elements from all detected Spatial Temporal Interest Points, yielding what we term discriminative action elements; the second detects affine invariant local features from holistic Motion History Images and selects action elements according to their compactness scores, yielding generative action elements. Action elements obtained either way are then used to construct a voting space based on their local feature representations as well as their global configuration constraints. Our approach is evaluated in the two main contexts of current human action analysis challenges: action retrieval and action classification. Comprehensive experimental results show that our proposed framework marginally outperforms all existing state-of-the-art techniques on a range of different datasets.
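
As a rough illustration of the space-time voting step described in the abstract, the sketch below casts Hough-style votes for an action centre in (x, y, t) from detected action elements. This is a hypothetical sketch, not the authors' implementation: the function name vote_action_centre, the codebook layout (prototype descriptor, offset to the action centre, weight), and the Gaussian descriptor similarity are illustrative assumptions.

    # Hypothetical sketch of space-time Hough-style voting with action elements.
    # Codebook entries, matching kernel, and vote weights are illustrative
    # assumptions, not the paper's actual implementation.
    import numpy as np

    def vote_action_centre(elements, codebook, grid_shape, sigma=1.0):
        """elements: list of (descriptor, (x, y, t)) local action elements in a clip.
        codebook: list of (prototype_descriptor, (dx, dy, dt), weight), where the
        offset points from the element to the action centre.
        Returns an accumulator over the (x, y, t) voting grid."""
        acc = np.zeros(grid_shape)
        for desc, (x, y, t) in elements:
            for proto, (dx, dy, dt), w in codebook:
                # Soft match between the local descriptor and the codebook prototype.
                sim = np.exp(-np.sum((desc - proto) ** 2) / (2 * sigma ** 2))
                cx, cy, ct = int(x + dx), int(y + dy), int(t + dt)
                if (0 <= cx < grid_shape[0] and 0 <= cy < grid_shape[1]
                        and 0 <= ct < grid_shape[2]):
                    # Accumulate the weighted vote at the hypothesised centre.
                    acc[cx, cy, ct] += w * sim
        return acc

    # The strongest action hypothesis is the peak of the accumulator:
    # cx, cy, ct = np.unravel_index(acc.argmax(), acc.shape)

In this reading, the local feature representation enters through the descriptor similarity, while the global configuration constraint enters through the learned centre offsets, so both local and global evidence shape the final voting space.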