Classifying web videos using a global video descriptor

Authors:
Berkan Solmaz;Shayan Modiri Assari;Mubarak Shah
Affiliations:
University of Central Florida, Orlando, USA 32816;University of Central Florida, Orlando, USA 32816;University of Central Florida, Orlando, USA 32816
Venue:
Machine Vision and Applications
Year:
2013

References:

Detection and Recognition of Periodic, Nonrigid Motion. International Journal of Computer Vision.
The Recognition of Human Movement Using Temporal Templates. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope. International Journal of Computer Vision.
Recognizing Human Actions: A Local SVM Approach. ICPR '04: Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), Volume 3.
On Space-Time Interest Points. International Journal of Computer Vision.
Behavior recognition via sparse spatio-temporal features. ICCCN '05: Proceedings of the 14th International Conference on Computer Communications and Networks.
A 3-dimensional sift descriptor and its application to action recognition. Proceedings of the 15th international conference on Multimedia.
A differential geometric approach to representing the human actions. Computer Vision and Image Understanding.
A survey on vision-based human action recognition. Image and Vision Computing.
Object, scene and actions: combining multiple features for human action recognition. ECCV'10: Proceedings of the 11th European conference on Computer vision, Part I.
A survey of vision-based methods for action representation, segmentation and recognition. Computer Vision and Image Understanding.
LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST).
Action Recognition Using Mined Hierarchical Compound Features. IEEE Transactions on Pattern Analysis and Machine Intelligence.
Action recognition by dense trajectories. CVPR '11: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition.
Global semantic classification of scenes using power spectrum templates. IM'99: Proceedings of the 1999 international conference on Challenge of Image Retrieval.
Action recognition in videos acquired by a moving camera using motion decomposition of Lagrangian particle trajectories. ICCV '11: Proceedings of the 2011 International Conference on Computer Vision.
HMDB: A large video database for human motion recognition. ICCV '11: Proceedings of the 2011 International Conference on Computer Vision.

Abstract

Computing descriptors for videos is a crucial task in computer vision. In this paper, we propose a global video descriptor for classification of videos. Our method bypasses the detection of interest points, the extraction of local video descriptors, and the quantization of descriptors into a codebook; it represents each video sequence as a single feature vector. Our global descriptor is computed by applying a bank of 3-D spatio-temporal filters to the frequency spectrum of a video sequence; hence, it integrates information about both motion and scene structure. We tested our approach on three datasets, KTH (Schuldt et al., Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), vol. 3, pp. 32-36, 2004), UCF50 (http://vision.eecs.ucf.edu/datasetsActions.html), and HMDB51 (Kuehne et al., HMDB: a large video database for human motion recognition, 2011), and obtained promising results that demonstrate the robustness and discriminative power of our global video descriptor for classifying videos of various actions. In addition, combining our global descriptor with a local descriptor yielded the highest classification accuracies on the UCF50 and HMDB51 datasets.
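
The general idea of filtering a video's 3-D frequency spectrum and pooling the responses into one vector can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the filter family, the number of filters, or the pooling scheme, so the Gaussian band-pass filters, the six orientations, the two scales, the 2x2x2 pooling grid, and the function names `gaussian_bandpass_bank` and `global_descriptor` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_bandpass_bank(shape, n_scales=2, sigma=0.1):
    """Build a hypothetical bank of 3-D Gaussian band-pass filters.

    Each filter is a Gaussian bump in the (t, y, x) frequency domain.
    The orientations and centre frequencies are illustrative assumptions.
    """
    # Frequency coordinates (cycles per sample) laid out like np.fft.fftn output.
    ft, fy, fx = np.meshgrid(
        *(np.fft.fftfreq(n) for n in shape), indexing="ij"
    )
    directions = np.array(
        [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1]],
        dtype=float,
    )
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    filters = []
    for s in range(n_scales):
        f0 = 0.25 / (2 ** s)  # assumed centre frequency for this scale
        for d in directions:
            c = f0 * d
            dist2 = (ft - c[0]) ** 2 + (fy - c[1]) ** 2 + (fx - c[2]) ** 2
            filters.append(np.exp(-dist2 / (2 * sigma ** 2)))
    return np.stack(filters)


def global_descriptor(video, filters, grid=(2, 2, 2)):
    """Represent a (T, H, W) grayscale video as a single feature vector."""
    spectrum = np.fft.fftn(video)  # 3-D frequency spectrum of the clip
    gt, gy, gx = grid
    feats = []
    for h in filters:
        # Filter in the frequency domain, then return to space-time.
        response = np.abs(np.fft.ifftn(spectrum * h))
        # Crop so each axis divides evenly, then average over sub-volumes.
        t, y, x = response.shape
        r = response[: t - t % gt, : y - y % gy, : x - x % gx]
        r = r.reshape(gt, r.shape[0] // gt, gy, r.shape[1] // gy,
                      gx, r.shape[2] // gx)
        feats.append(r.mean(axis=(1, 3, 5)).ravel())
    return np.concatenate(feats)


if __name__ == "__main__":
    # Random data stands in for a real grayscale clip.
    video = np.random.default_rng(0).random((16, 32, 32))
    bank = gaussian_bandpass_bank(video.shape)
    descriptor = global_descriptor(video, bank)
    print(descriptor.shape)
```

Under these assumptions the descriptor length is the number of filters times the number of pooling cells, and no interest-point detection or codebook is involved. A vector of this kind could then be passed to a standard classifier such as an SVM.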