Human Action Segmentation and Recognition Using Discriminative Semi-Markov Models

Authors:
Qinfeng Shi; Li Cheng; Li Wang; Alex Smola
Affiliations:
University of Adelaide, Adelaide, Australia; Bioinformatics Institute, A*STAR, Singapore, Singapore; Nanjing Forestry University, Nanjing, China; Yahoo! Research, Santa Clara, USA
Venue:
International Journal of Computer Vision
Year:
2011

Abstract

A challenging problem in human action understanding is to jointly segment and recognize human actions in an unseen video sequence in which one person performs a sequence of continuous actions. In this paper, we propose a discriminative semi-Markov model approach and define a set of features over boundary frames, segments, and neighboring segments. This enables us to conveniently capture a combination of local and global features that best represent each specific action type. To efficiently solve the inference problem of simultaneous segmentation and recognition, a Viterbi-like dynamic programming algorithm is used, which in practice is able to process 20 frames per second. Moreover, the model is learned discriminatively under the large-margin principle, and learning is formulated as an optimization problem with exponentially many constraints. To solve it efficiently, we present two optimization algorithms, the cutting plane method and the bundle method, and demonstrate that either can be deployed in a "plug and play" fashion. On the theoretical side, we also analyze the generalization error of the proposed approach and provide a PAC-Bayes bound. The proposed approach is evaluated on a variety of datasets and is shown to perform competitively with state-of-the-art methods. For example, on the KTH dataset it achieves 95.0% recognition accuracy, where the best known result on this dataset is 93.4% (Reddy and Shah in ICCV, 2009).
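
The abstract names a Viterbi-like dynamic program for joint segmentation and recognition but does not spell out the recursion. The following is a minimal sketch of a generic semi-Markov Viterbi decoder, not the authors' implementation. The function name `semi_markov_viterbi`, the `segment_score` and `transition_score` callbacks, the maximum segment length, and the toy per-frame scores are all assumptions made for illustration; the paper's actual boundary, segment, and neighboring-segment features would take the place of these callbacks.

```python
from typing import Callable, List, Optional, Tuple

Segment = Tuple[int, int, int]  # (start frame, end frame exclusive, label)


def semi_markov_viterbi(
    n_frames: int,
    n_labels: int,
    max_len: int,
    segment_score: Callable[[int, int, int], float],
    transition_score: Callable[[int, int], float],
) -> Tuple[float, List[Segment]]:
    """Return the highest-scoring labeled segmentation of frames [0, n_frames)."""
    neg_inf = float("-inf")
    # best[t][y]: best score over segmentations of frames [0, t)
    # whose last segment carries label y.
    best = [[neg_inf] * n_labels for _ in range(n_frames + 1)]
    # back[t][y]: (start of last segment, label of previous segment or None).
    back: List[List[Optional[Tuple[int, Optional[int]]]]] = [
        [None] * n_labels for _ in range(n_frames + 1)
    ]

    for t in range(1, n_frames + 1):
        for y in range(n_labels):
            for d in range(1, min(max_len, t) + 1):
                s = t - d  # candidate start of the last segment
                seg = segment_score(s, t, y)
                if s == 0:
                    # First segment of the sequence: no transition term.
                    if seg > best[t][y]:
                        best[t][y] = seg
                        back[t][y] = (0, None)
                    continue
                for y_prev in range(n_labels):
                    if best[s][y_prev] == neg_inf:
                        continue
                    cand = best[s][y_prev] + transition_score(y_prev, y) + seg
                    if cand > best[t][y]:
                        best[t][y] = cand
                        back[t][y] = (s, y_prev)

    # Backtrack from the best final label.
    label: Optional[int] = max(range(n_labels), key=lambda k: best[n_frames][k])
    score = best[n_frames][label]
    segments: List[Segment] = []
    t = n_frames
    while t > 0:
        s, y_prev = back[t][label]
        segments.append((s, t, label))
        t, label = s, y_prev
    segments.reverse()
    return score, segments


# Toy data (hypothetical): frames 0-2 favor label 0, frames 3-5 favor label 1.
frame_scores = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3


def toy_segment_score(start: int, end: int, label: int) -> float:
    # Sum of per-frame scores; a stand-in for real segment features.
    return sum(frame_scores[i][label] for i in range(start, end))


def toy_transition_score(prev_label: int, label: int) -> float:
    # Constant penalty per boundary, discouraging over-segmentation.
    return -0.5


score, segments = semi_markov_viterbi(
    n_frames=6,
    n_labels=2,
    max_len=4,
    segment_score=toy_segment_score,
    transition_score=toy_transition_score,
)
print(score, segments)
```

With a maximum segment length of L, T frames, and K labels, this sketch evaluates on the order of T·L·K² candidates, which is what makes exact joint decoding tractable despite the exponential number of possible segmentations. Calling `segment_score` inside the inner loops keeps the example short; a practical decoder would cache or incrementally update segment scores.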