In this paper, we propose a novel approach to automatically learn a compact yet discriminative representation for human action recognition. To capture both static visual information and motion information, each frame is represented in two feature subsets (views), and a Gaussian Mixture Model (GMM) is adopted to model the distributions of those features. To complement the strengths of the different features (views), a Co-EM based multi-view learning framework is introduced to estimate the parameters of the GMM in place of conventional single-view EM. The Gaussian components are then treated as video words to describe videos at different time resolutions. Compared with traditional action recognition methods, the proposed Co-EM strategy offers several advantages. First, complex actions are efficiently modeled by the GMM, and the number of its components is automatically determined with the Minimum Description Length (MDL) criterion. Second, because the imperfection of one view can be compensated by the other view in Co-EM, the resulting bag of video words is superior to that formed from any single view. To the best of our knowledge, we are the first to apply Co-EM based multi-view learning to action recognition, and we obtain significantly better results. We extensively verify the proposed approach on two publicly available and challenging datasets, the KTH dataset and the Weizmann dataset. The experimental results show the validity of our proposed method.
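As a rough illustration of the Co-EM strategy described above, the following Python sketch alternates E- and M-steps between two feature views of the same frames: responsibilities computed under one view's GMM drive the parameter update of the other view's GMM, so each view's weaknesses can be compensated by the other. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names and initialization are illustrative, and the number of components K is fixed here rather than selected by MDL as in the paper.

import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, weights, means, covs):
    # Posterior responsibilities p(component | x) under one view's GMM.
    K = len(weights)
    resp = np.column_stack([
        weights[k] * multivariate_normal.pdf(X, means[k], covs[k])
        for k in range(K)
    ])
    resp /= np.clip(resp.sum(axis=1, keepdims=True), 1e-300, None)
    return resp

def m_step(X, resp, reg=1e-6):
    # Re-estimate one view's GMM parameters from (possibly cross-view)
    # responsibilities; reg keeps covariances well conditioned.
    N, d = X.shape
    Nk = resp.sum(axis=0)
    weights = Nk / N
    means = (resp.T @ X) / Nk[:, None]
    covs = []
    for k in range(resp.shape[1]):
        diff = X - means[k]
        covs.append((resp[:, k][:, None] * diff).T @ diff / Nk[k] + reg * np.eye(d))
    return weights, means, np.array(covs)

def co_em(X1, X2, K, n_iter=50, seed=0):
    # Two-view Co-EM: the E-step on one view feeds the M-step of the other.
    # K is fixed here; the paper selects it automatically via MDL.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X1), K, replace=False)
    params1 = (np.full(K, 1.0 / K), X1[idx].copy(), np.array([np.cov(X1.T)] * K))
    params2 = (np.full(K, 1.0 / K), X2[idx].copy(), np.array([np.cov(X2.T)] * K))
    for _ in range(n_iter):
        resp1 = e_step(X1, *params1)   # E-step on view 1 ...
        params2 = m_step(X2, resp1)    # ... drives M-step on view 2
        resp2 = e_step(X2, *params2)   # E-step on view 2 ...
        params1 = m_step(X1, resp2)    # ... drives M-step on view 1
    return params1, params2

# Toy usage on synthetic data: two 2-D views of the same 300 frames,
# where the second view is a noisy transform of the first.
rng = np.random.default_rng(1)
X1 = np.vstack([rng.normal(m, 0.5, (100, 2)) for m in ([0, 0], [3, 0], [0, 3])])
X2 = X1 + rng.normal(0, 0.3, X1.shape)
p1, p2 = co_em(X1, X2, K=3)

After convergence, the fitted Gaussian components of each view would play the role of the video words described above, with each frame assigned to components by its responsibilities.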