Evaluation of local descriptors for action recognition in videos

Authors:
Piotr Bilinski; Francois Bremond
Affiliations:
INRIA Sophia Antipolis, Sophia Antipolis Cedex, France; INRIA Sophia Antipolis, Sophia Antipolis Cedex, France
Venue:
ICVS'11 Proceedings of the 8th international conference on Computer vision systems
Year:
2011

Abstract

Local descriptors have recently drawn considerable attention as a representation for action recognition. They capture both appearance and motion, are robust to viewpoint and scale changes, and are easy to implement and fast to compute. Moreover, they have been shown to achieve good performance for action classification in videos. In recent years, many different local spatio-temporal descriptors have been proposed. However, they are usually tested on different datasets and with different experimental protocols, and the experiments rely on assumptions that prevent a full evaluation of the descriptors. In this paper, we present a full evaluation of local spatio-temporal descriptors for action recognition in videos. We chose four descriptors that are widely used in state-of-the-art approaches (HOG, HOF, HOG-HOF and HOG3D) and four video datasets, and tested the descriptors in a framework based on the bag-of-words model and Support Vector Machines.
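
The bag-of-words step named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `nearest_word` and `bag_of_words`, the hand-written codebook, and the 2-D toy descriptors are all assumptions made for the example. Real HOG, HOF or HOG3D descriptors have many more dimensions, and a codebook would typically be learned by clustering training descriptors rather than written by hand.

```python
import math


def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor (Euclidean distance)."""
    best_index, best_dist = 0, math.inf
    for index, word in enumerate(codebook):
        dist = math.dist(descriptor, word)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bag_of_words(descriptors, codebook):
    """Encode one video's local descriptors as an L1-normalised histogram of visual words."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    # A video with no detected descriptors maps to an all-zero histogram.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical toy data: a 3-word codebook and four 2-D descriptors from one video.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
video_descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.3)]

histogram = bag_of_words(video_descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

Each video is reduced to a fixed-length histogram regardless of how many descriptors it contains, which is the kind of vector a Support Vector Machine classifier could then be trained on.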