Constructing the bag-of-features model from space-time interest points (STIPs) has been used successfully for human action recognition. However, how to eliminate the large number of STIPs that are irrelevant to a specific action in realistic scenarios, and how to select discriminative codewords for an effective bag-of-features model, remain open problems. In this paper, we propose to select more representative codewords based on our pruned-interest-points algorithm, reducing computational cost while improving recognition performance. Taking human perception into account, an attention-based saliency map is employed to retain only the interest points that fall into salient regions, since visual saliency provides strong evidence for the location of acting subjects. After the salient interest points are identified, each human action is represented with the bag-of-features model. To obtain more discriminative codewords, an unsupervised codeword selection algorithm is applied. Finally, a Support Vector Machine (SVM) is employed to recognize human actions. Comprehensive experiments on the widely used and challenging Hollywood-2 Human Action (HOHA-2) and YouTube datasets demonstrate that the proposed method is computationally efficient while achieving improved performance in recognizing realistic human actions.
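The two core steps described above — pruning interest points by a saliency map and building a bag-of-features histogram over a codebook — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the saliency threshold of 0.5, and the toy codebook are all hypothetical, and the saliency map is assumed to be a 2-D array normalized to [0, 1].

```python
import numpy as np

def prune_interest_points(points, saliency_map, threshold=0.5):
    """Keep only interest points that fall into salient regions.

    points: (N, 2) integer array of (row, col) coordinates.
    saliency_map: 2-D array with values in [0, 1].
    threshold: hypothetical cut-off; the paper's actual criterion may differ.
    """
    rows, cols = points[:, 0], points[:, 1]
    return points[saliency_map[rows, cols] >= threshold]

def bag_of_features(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments.

    descriptors: (N, D) array of local descriptors at the kept points.
    codebook: (K, D) array of codewords (e.g. k-means centers).
    """
    # Euclidean distance from each descriptor to each codeword: (N, K)
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    assignments = dists.argmin(axis=1)
    hist = np.bincount(assignments, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

The resulting histogram is the fixed-length vector fed to the SVM classifier; the unsupervised codeword selection step would act on the codebook before the histogram is built, which is omitted here.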