In this paper, we propose a novel approach to automatically learn a compact yet discriminative representation for human action recognition. To capture both static visual information and motion information, each frame is represented in two feature subsets (views), and a Gaussian Mixture Model (GMM) is adopted to model the distributions of those features. To complement the strengths of the different features (views), a Co-EM based multi-view learning framework is introduced to estimate the parameters of the GMM in place of conventional single-view EM. The Gaussian components are then treated as video words to describe videos at different time resolutions. Compared with traditional action recognition methods, the proposed Co-EM strategy offers several advantages. First, complex actions are efficiently modeled by the GMM, and the number of its components is automatically determined with the Minimum Description Length (MDL) criterion. Second, because the imperfection of one view can be compensated by the other view in Co-EM, the resulting bag of video words is superior to that formed from any single view. To the best of our knowledge, we are the first to apply Co-EM based multi-view learning to action recognition, and we obtain significantly better results. We extensively verify the proposed approach on two publicly available and challenging datasets, the KTH dataset and the Weizmann dataset. The experimental results show the validity of our proposed method.
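As a rough illustration of the Co-EM strategy described above, the following Python sketch alternates E- and M-steps between two feature views of the same frames: responsibilities computed under one view's GMM drive the parameter update of the other view's GMM, so each view's weaknesses can be compensated by the other. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names and initialization are illustrative, and the number of components K is fixed here rather than selected by MDL as in the paper.

import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, weights, means, covs):
    # Posterior responsibilities p(component | x) under one view's GMM.
    K = len(weights)
    resp = np.column_stack([
        weights[k] * multivariate_normal.pdf(X, means[k], covs[k])
        for k in range(K)
    ])
    resp /= np.clip(resp.sum(axis=1, keepdims=True), 1e-300, None)
    return resp

def m_step(X, resp, reg=1e-6):
    # Re-estimate one view's GMM parameters from (possibly cross-view)
    # responsibilities; reg keeps covariances well conditioned.
    N, d = X.shape
    Nk = resp.sum(axis=0)
    weights = Nk / N
    means = (resp.T @ X) / Nk[:, None]
    covs = []
    for k in range(resp.shape[1]):
        diff = X - means[k]
        covs.append((resp[:, k][:, None] * diff).T @ diff / Nk[k] + reg * np.eye(d))
    return weights, means, np.array(covs)

def co_em(X1, X2, K, n_iter=50, seed=0):
    # Two-view Co-EM: the E-step on one view feeds the M-step of the other.
    # K is fixed here; the paper selects it automatically via MDL.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X1), K, replace=False)
    params1 = (np.full(K, 1.0 / K), X1[idx].copy(), np.array([np.cov(X1.T)] * K))
    params2 = (np.full(K, 1.0 / K), X2[idx].copy(), np.array([np.cov(X2.T)] * K))
    for _ in range(n_iter):
        resp1 = e_step(X1, *params1)   # E-step on view 1 ...
        params2 = m_step(X2, resp1)    # ... drives M-step on view 2
        resp2 = e_step(X2, *params2)   # E-step on view 2 ...
        params1 = m_step(X1, resp2)    # ... drives M-step on view 1
    return params1, params2

# Toy usage on synthetic data: two 2-D views of the same 300 frames,
# where the second view is a noisy transform of the first.
rng = np.random.default_rng(1)
X1 = np.vstack([rng.normal(m, 0.5, (100, 2)) for m in ([0, 0], [3, 0], [0, 3])])
X2 = X1 + rng.normal(0, 0.3, X1.shape)
p1, p2 = co_em(X1, X2, K=3)

After convergence, the fitted Gaussian components of each view would play the role of the video words described above, with each frame assigned to components by its responsibilities.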