Editor's Choice Article: Human activity recognition in videos using a single example

Authors:
Mehrsan Javan Roshtkhari; Martin D. Levine
Venue:
Image and Vision Computing
Year:
2013


Abstract

This paper presents a novel approach for action recognition, localization and video matching based on a hierarchical codebook model of local spatio-temporal video volumes. Given a single example of an activity as a query video, the proposed method finds videos similar to the query in a target video dataset. The method is based on the bag of video words (BOV) representation and does not require prior knowledge about actions, background subtraction, motion estimation or tracking. It is also robust to spatial and temporal scale changes, as well as to some deformations. The hierarchical algorithm codes a video as a compact set of spatio-temporal volumes, while considering their spatio-temporal compositions in order to account for spatial and temporal contextual information. The hierarchy is built in two steps. First, a codebook of spatio-temporal video volumes is constructed. Then, a large contextual volume containing many spatio-temporal volumes (an ensemble of volumes) is considered. These ensembles are used to construct a probabilistic model of video volumes and their spatio-temporal compositions. The algorithm was applied to three available video datasets for action recognition with different complexities (KTH, Weizmann, and MSR II), and the results were superior to those of other approaches, especially in the case of a single training example and cross-dataset action recognition.
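
To make the bag of video words idea concrete, the following is a minimal sketch of its first level only: dense spatio-temporal volumes are cut from a video, clustered into a codebook, and a query is compared with a target by histogram similarity. It is not the authors' implementation. The raw-intensity descriptor, the plain k-means clustering, the histogram-intersection score, and all function names and parameter values are assumptions chosen for illustration, and the ensemble-of-volumes level and the probabilistic model described in the abstract are omitted.

```python
import numpy as np

def extract_volumes(video, size=(4, 4, 4), stride=4):
    """Densely sample small spatio-temporal volumes from a (T, H, W) array.

    Each volume is flattened into a raw-intensity vector (an illustrative
    descriptor, not the one used in the paper).
    """
    t, h, w = video.shape
    st, sh, sw = size
    vols = []
    for z in range(0, t - st + 1, stride):
        for y in range(0, h - sh + 1, stride):
            for x in range(0, w - sw + 1, stride):
                vols.append(video[z:z + st, y:y + sh, x:x + sw].ravel())
    return np.asarray(vols, dtype=float)

def assign(descriptors, centers):
    """Index of the nearest codeword (squared Euclidean distance) per descriptor."""
    diff = descriptors[:, None, :] - centers[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)

def build_codebook(descriptors, k=8, iters=10, seed=0):
    """Plain k-means, standing in for whatever clustering a real system uses."""
    rng = np.random.default_rng(seed)
    picks = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[picks].copy()
    for _ in range(iters):
        labels = assign(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return centers

def bov_histogram(video, centers):
    """Normalized histogram of codeword occurrences for one video."""
    labels = assign(extract_volumes(video), centers)
    hist = np.bincount(labels, minlength=len(centers)).astype(float)
    return hist / hist.sum()

def similarity(h1, h2):
    """Histogram intersection; 1.0 means identical normalized histograms."""
    return float(np.minimum(h1, h2).sum())

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    query = rng.random((8, 16, 16))    # synthetic stand-in for a query clip
    target = rng.random((8, 16, 16))   # synthetic stand-in for a target clip
    pooled = np.vstack([extract_volumes(query), extract_volumes(target)])
    codebook = build_codebook(pooled, k=4)
    score = similarity(bov_histogram(query, codebook),
                       bov_histogram(target, codebook))
    print(f"query/target similarity: {score:.3f}")
```

A flat histogram like this discards where and when each volume occurs, which is the contextual information the hierarchical model in the abstract is designed to retain.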