Human action recognition is an important yet challenging task. Recently developed commodity depth sensors open up new possibilities for tackling this problem but also present unique challenges. The depth maps captured by these cameras are noisy, and the 3D positions of the tracked joints can be completely wrong under severe occlusion, which increases the intra-class variation of the actions. In this paper, an actionlet ensemble model is learned to represent each action and to capture this intra-class variance. In addition, novel features suited to depth data are proposed: they are robust to noise, invariant to translational and temporal misalignment, and capable of characterizing both human motion and human-object interactions. The proposed approach is evaluated on two challenging action recognition datasets captured by commodity depth cameras, and on a third dataset captured by a MoCap system. The experimental evaluations show that the proposed approach outperforms state-of-the-art algorithms.
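To make the invariance claims concrete, the sketch below shows two generic skeleton features of the kind the abstract describes. It is an illustration, not the paper's actual feature pipeline: `relative_joint_features` and `temporal_fourier_feature` are hypothetical helpers. Pairwise joint differences cancel any global translation of the skeleton, and keeping only the magnitudes of low-frequency DFT coefficients (discarding phase) gives tolerance to temporal shifts of the sequence.

```python
import cmath
from itertools import combinations

def relative_joint_features(joints):
    """Pairwise relative 3D joint positions for one frame.

    joints: list of (x, y, z) tuples for the tracked joints.
    Returns a flat list of p_i - p_j over all joint pairs, which is
    invariant to a global translation of the whole skeleton.
    """
    feats = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(joints, 2):
        feats.extend((x1 - x2, y1 - y2, z1 - z2))
    return feats

def temporal_fourier_feature(values, n_coeffs=4):
    """Magnitudes of the first n_coeffs DFT coefficients of a
    per-frame feature sequence (one dimension over T frames).

    Dropping the phase makes the descriptor insensitive to
    (circular) temporal shifts of the sequence.
    """
    T = len(values)
    mags = []
    for k in range(n_coeffs):
        c = sum(v * cmath.exp(-2j * cmath.pi * k * t / T)
                for t, v in enumerate(values))
        mags.append(abs(c))
    return mags
```

For example, translating every joint by the same offset leaves `relative_joint_features` unchanged, and circularly shifting a sequence in time leaves `temporal_fourier_feature` unchanged up to floating-point error.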