Sparse B-spline polynomial descriptors for human activity recognition

  • Authors:
  • Antonios Oikonomopoulos; Maja Pantic; Ioannis Patras

  • Affiliations:
  • Department of Computing, Imperial College London, UK; Department of Computing, Imperial College London, UK, and Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, The Netherlands; Department of Electronic Engineering, Queen Mary University of London, UK

  • Venue:
  • Image and Vision Computing
  • Year:
  • 2009

Abstract

The extraction and quantization of local image and video descriptors for the subsequent creation of visual codebooks is a technique that has proven very effective for image and video retrieval applications. In this paper we build on this concept and propose a new set of visual descriptors that provide a local space-time description of visual activity. The proposed descriptors are extracted at spatiotemporal salient points detected on the estimated optical flow field of a given image sequence, and are based on geometrical properties of three-dimensional piecewise polynomials, namely B-splines, which are fitted to the spatiotemporal locations of the salient points that fall within a given spatiotemporal neighborhood. Our descriptors are invariant to translation and scaling in space-time; scale invariance is ensured by coupling the neighborhood dimensions to the scale at which the corresponding spatiotemporal salient points are detected. In addition, to provide robustness against camera motion (e.g. global translation due to camera panning), we subtract the motion component estimated by applying local median filters to the optical flow field. The descriptors extracted across the whole dataset are clustered to create a codebook of 'visual verbs', where each verb corresponds to a cluster center. We use the resulting codebook in a 'bag of verbs' approach to represent the motion of the subjects within small temporal windows. Finally, we use a boosting algorithm to select the most discriminative temporal windows of each class, and Relevance Vector Machines (RVM) for classification. Results on three different databases of human actions verify the effectiveness of our method.
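
To make the abstract's pipeline concrete, the sketch below illustrates four of its stages: camera-motion compensation via local median filtering of the optical flow, B-spline fitting over the salient points in a spatiotemporal neighborhood, codebook clustering into 'visual verbs', and bag-of-verbs histograms over temporal windows. This is a minimal sketch under stated assumptions, not the authors' implementation: all function names and parameter values are hypothetical, SciPy/scikit-learn are assumed stand-ins for whatever tooling the paper used, and the sampled spline curve here merely stands in for the geometric properties the paper derives from the fitted polynomial.

```python
# Illustrative sketch only: names, libraries, and parameter values are
# assumptions, not the authors' implementation.
import numpy as np
from scipy.ndimage import median_filter
from scipy.interpolate import splprep, splev
from sklearn.cluster import KMeans


def compensate_camera_motion(flow, filter_size=15):
    """Subtract a locally median-filtered component from a dense optical
    flow field (H x W x 2), approximating the paper's robustness step
    against global camera motion such as panning."""
    background = np.stack(
        [median_filter(flow[..., c], size=filter_size) for c in range(2)],
        axis=-1,
    )
    return flow - background


def bspline_descriptor(points, n_samples=16):
    """Fit a 3-D B-spline to the (x, y, t) locations of the salient points
    in one spatiotemporal neighborhood and sample it uniformly. Centering
    and rescaling give translation and scale invariance."""
    pts = np.asarray(points, dtype=float)      # shape (N, 3); assumes N >= 2
    centered = pts - pts.mean(axis=0)          # translation invariance
    scale = np.abs(centered).max()
    centered /= scale if scale > 0 else 1.0    # scale invariance
    degree = min(3, len(pts) - 1)              # spline needs more points than degree
    tck, _ = splprep(list(centered.T), s=0.0, k=degree)
    u = np.linspace(0.0, 1.0, n_samples)
    return np.concatenate(splev(u, tck))       # fixed-length descriptor vector


def build_codebook(all_descriptors, n_verbs=64):
    """Cluster descriptors from the whole dataset; each cluster center is
    one 'visual verb' of the codebook."""
    return KMeans(n_clusters=n_verbs, n_init=10).fit(np.vstack(all_descriptors))


def bag_of_verbs(codebook, window_descriptors):
    """Normalized histogram of verb assignments for the descriptors that
    fall inside one small temporal window."""
    labels = codebook.predict(np.vstack(window_descriptors))
    hist = np.bincount(labels, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)
```

The final stages of the paper, boosting-based selection of the most discriminative temporal windows and RVM classification, are not sketched here; in a reproduction, any standard boosting feature selector and a sparse Bayesian classifier could fill those roles.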