In this paper we propose a novel method for human action recognition based on boosted key-frame selection and correlated pyramidal motion-feature representations. Instead of using an unsupervised method to detect interest points, a Pyramidal Motion Feature (PMF), which combines optical flow with a biologically inspired feature, is extracted from each frame of a video sequence. The AdaBoost learning algorithm is then applied to select the most discriminative frames from a large feature pool. In this way, we obtain the top-ranked boosted frames of each video sequence as the key frames, which carry the most representative motion information. Furthermore, we utilise the correlogram, which captures not only the probabilistic distribution of features within a single frame but also the temporal relationships across the action sequence. In the classification phase, a Support Vector Machine (SVM) is adopted as the final classifier for human action recognition. To demonstrate generalisability, our method has been systematically tested on a variety of datasets and shown to be more effective and accurate for action recognition than previous work. With the proposed method we obtain overall accuracies of 95.5%, 93.7%, and 36.5% on the KTH, the multi-view IXMAS, and the challenging HMDB51 datasets, respectively.
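The pipeline above — per-frame motion descriptors, AdaBoost-based key-frame selection, and a final SVM — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Pyramidal Motion Features are replaced by synthetic per-frame vectors, key frames are ranked by the boosted classifier's confidence margin (a plausible proxy for the paper's boosted selection), and correlogram features are omitted.

```python
# Hedged sketch of boosted key-frame selection followed by SVM classification.
# Synthetic vectors stand in for per-frame PMF descriptors; the ranking rule
# (AdaBoost decision margin) is an assumption, not the authors' exact scheme.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def synthetic_video(label, n_frames=30, dim=16):
    # Stand-in for per-frame PMF descriptors (optical flow + bio-inspired cues).
    return rng.normal(loc=label, scale=1.0, size=(n_frames, dim))

def select_key_frames(frames, booster, k=5):
    # Rank frames by the boosted classifier's confidence (binary case) and
    # keep the top-k as "key frames".
    margins = np.abs(booster.decision_function(frames))
    return frames[np.argsort(margins)[-k:]]

# Toy dataset: 2 action classes, 10 videos each.
videos, labels = [], []
for label in (0, 1):
    for _ in range(10):
        videos.append(synthetic_video(label))
        labels.append(label)

# Train AdaBoost on all frames, each labelled with its video's action class.
frame_X = np.vstack(videos)
frame_y = np.concatenate([[y] * len(v) for v, y in zip(videos, labels)])
booster = AdaBoostClassifier(n_estimators=50, random_state=0).fit(frame_X, frame_y)

# Pool key-frame descriptors per video (mean pooling) and classify with an SVM.
pooled = np.array([select_key_frames(v, booster).mean(axis=0) for v in videos])
svm = SVC(kernel="rbf").fit(pooled, labels)
acc = svm.score(pooled, labels)
```

Mean pooling over the selected key frames is one simple way to obtain a fixed-length video descriptor; the paper instead builds a correlogram over the key-frame sequence, which additionally encodes temporal co-occurrence.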