Constructing the bag-of-features model from space-time interest points (STIPs) has been used successfully for human action recognition. However, how to eliminate the large number of STIPs that are irrelevant to a specific action in realistic scenarios, and how to select discriminative codewords for an effective bag-of-features model, remain open problems. In this paper, we propose to select more representative codewords based on our pruned-interest-points algorithm, reducing computational cost while improving recognition performance. Taking human perception into account, an attention-based saliency map is employed to retain only the interest points that fall into salient regions, since visual saliency provides strong evidence for the location of acting subjects. After the salient interest points are identified, each human action is represented with the bag-of-features model. To obtain more discriminative codewords, an unsupervised codeword selection algorithm is applied. Finally, a Support Vector Machine (SVM) is employed to recognize human actions. Comprehensive experiments on the widely used and challenging Hollywood-2 Human Action (HOHA-2) and YouTube datasets demonstrate that the proposed method is computationally efficient while achieving improved performance in recognizing realistic human actions.
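The two core steps described above — pruning interest points by a saliency map and building a bag-of-features histogram over a codebook — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the saliency threshold of 0.5, and the toy codebook are all hypothetical, and the saliency map is assumed to be a 2-D array normalized to [0, 1].

```python
import numpy as np

def prune_interest_points(points, saliency_map, threshold=0.5):
    """Keep only interest points that fall into salient regions.

    points: (N, 2) integer array of (row, col) coordinates.
    saliency_map: 2-D array with values in [0, 1].
    threshold: hypothetical cut-off; the paper's actual criterion may differ.
    """
    rows, cols = points[:, 0], points[:, 1]
    return points[saliency_map[rows, cols] >= threshold]

def bag_of_features(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments.

    descriptors: (N, D) array of local descriptors at the kept points.
    codebook: (K, D) array of codewords (e.g. k-means centers).
    """
    # Euclidean distance from each descriptor to each codeword: (N, K)
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    assignments = dists.argmin(axis=1)
    hist = np.bincount(assignments, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

The resulting histogram is the fixed-length vector fed to the SVM classifier; the unsupervised codeword selection step would act on the codebook before the histogram is built, which is omitted here.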