Improving human action recognition in videos is restricted by the inherent limitations of visual data alone. In this paper, we take depth information into consideration and construct a novel dataset of human daily actions. The proposed ACT42 dataset provides synchronized data from 4 views and 2 sources, aiming to facilitate research on action analysis across multiple views and multiple sources. We also propose a new depth-based descriptor for action representation, which captures the structural relations of spatio-temporal points within the action volume using the distance information in depth data. In experimental validation, our descriptor achieves performance superior to state-of-the-art action descriptors designed for color information, and is more robust to viewpoint variations. The fusion of features from different sources is also discussed, and a simple but effective fusion method is presented to establish a baseline performance on the proposed dataset.
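To make the idea of a structure-aware depth descriptor concrete, the sketch below histograms pairwise distances among spatio-temporal interest points, using each point's depth value to lift image coordinates toward metric space. This is only an illustrative sketch under assumed conventions (the function name, the pinhole-like scaling of image coordinates by depth, and the unit weighting of the time axis are all our own choices), not the exact formulation used in the paper.

```python
import numpy as np

def depth_relation_descriptor(points, depths, num_bins=16, max_dist=2.0):
    """Histogram of pairwise distances among spatio-temporal interest
    points, with depth used to place points in (approximate) 3-D space.

    points : (N, 3) array of (x, y, t) interest-point locations
    depths : (N,)   depth value at each point (e.g. metres)

    Illustrative sketch only -- not the ACT42 paper's descriptor.
    """
    pts = np.asarray(points, dtype=float)
    d = np.asarray(depths, dtype=float)
    # Pinhole-like lift: scale image coordinates by depth, keep time
    # as a third axis with unit weighting (both are assumptions).
    xyz = np.column_stack([pts[:, 0] * d, pts[:, 1] * d, pts[:, 2]])
    # All pairwise Euclidean distances between interest points.
    diff = xyz[:, None, :] - xyz[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep each unordered pair once (upper triangle, no self-pairs).
    iu = np.triu_indices(len(pts), k=1)
    hist, _ = np.histogram(dist[iu], bins=num_bins, range=(0.0, max_dist))
    total = hist.sum()
    return hist / total if total else hist.astype(float)
```

Because the histogram is built from relative distances rather than absolute image positions, a descriptor of this kind is comparatively insensitive to viewpoint changes, which is the property the abstract highlights for the depth-based representation.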