Human action recognition using a temporal hierarchy of covariance descriptors on 3D joint locations

Authors:
Mohamed E. Hussein;Marwan Torki;Mohammad A. Gowayyed;Motaz El-Saban
Affiliations:
Department of Computer and Systems Engineering, Alexandria University, Alexandria, Egypt;Department of Computer and Systems Engineering, Alexandria University, Alexandria, Egypt;Department of Computer and Systems Engineering, Alexandria University, Alexandria, Egypt;Microsoft Research Advanced Technology Lab Cairo, Cairo, Egypt
Venue:
IJCAI '13: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence
Year:
2013

Abstract

Human action recognition from videos is a challenging machine vision task with multiple important application domains, such as human-robot/machine interaction, interactive entertainment, multimedia information retrieval, and surveillance. In this paper, we present a novel approach to human action recognition from 3D skeleton sequences extracted from depth data. We use the covariance matrix of skeleton joint locations over time as a discriminative descriptor for a sequence. To encode the relationship between joint movement and time, we deploy multiple covariance matrices over sub-sequences in a hierarchical fashion. The descriptor has a fixed length that is independent of the length of the described sequence. Our experiments show that using the covariance descriptor with an off-the-shelf classification algorithm outperforms the state of the art in action recognition on multiple datasets, captured either via a Kinect-type sensor or a sophisticated motion capture system. We also include an evaluation on a novel large dataset using our own annotation.
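
The descriptor outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the hierarchy is laid out, so the details here are assumptions rather than the authors' method: level l splits the sequence into 2^l non-overlapping windows, and only the upper triangle of each symmetric covariance matrix is kept. The function names and the `levels` parameter are hypothetical.

```python
import numpy as np

def covariance_descriptor(frames):
    """Upper triangle of the covariance of joint coordinates over time.

    frames: (T, D) array, one row per frame, D = 3 * number of joints.
    """
    cov = np.atleast_2d(np.cov(frames, rowvar=False))  # (D, D), columns are variables
    rows, cols = np.triu_indices(cov.shape[0])         # symmetric, so keep one triangle
    return cov[rows, cols]

def hierarchical_descriptor(frames, levels=2):
    """Concatenate covariance descriptors over a temporal hierarchy.

    Assumption: level l uses 2**l non-overlapping windows of near-equal length.
    Every window needs at least two frames for the covariance to be defined.
    """
    frames = np.asarray(frames, dtype=float)
    parts = []
    for level in range(levels):
        for window in np.array_split(frames, 2 ** level, axis=0):
            parts.append(covariance_descriptor(window))
    return np.concatenate(parts)
```

Under these assumptions the output length is (2^levels - 1) * D(D + 1)/2, which depends only on the number of joints and the number of levels, not on the number of frames. Two sequences of different durations therefore map to vectors of the same size, which is what allows them to be passed directly to an off-the-shelf classifier.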