Event detection plays an essential role in video content analysis and remains a challenging open problem. In particular, research on detecting human-related video events in complex scenes that contain both crowds of people and dynamic motion is still limited. In this paper, we investigate the detection of video events that involve elementary human actions, e.g., making a cellphone call, putting an object down, and pointing to something, in complex scenes, using a novel approach based on a spatio-temporal descriptor. The proposed descriptor temporally integrates the statistics of a set of response maps of low-level features, e.g., image gradients and optical flows, within a space-time cube, and thereby captures the appearance and motion patterns that characterize an action. Based on these descriptors, the bag-of-words method is used to describe a human figure as a concise feature vector. These feature vectors are then used to train SVM classifiers at multiple spatial pyramid levels to distinguish different actions. Finally, Gaussian-kernel-based temporal filtering is applied to segment sequences of events from a video stream, taking into account the temporal consistency of actions. The proposed approach tolerates the spatial layout variations and local deformations of human actions that arise from diverse view angles and rough human figure alignment in complex scenes. Extensive experiments on the 50-hour video dataset of the TRECVid 2008 event detection task demonstrate that our approach outperforms the well-known SIFT descriptor based methods and effectively detects video events in challenging real-world conditions.
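The final step of the pipeline, temporal filtering of classifier outputs, can be illustrated with a minimal sketch. It assumes that the classifier yields one score per frame for a given action, smooths those scores with a normalized Gaussian kernel, and reports the runs of frames whose smoothed score exceeds a threshold. The function names, the values of `sigma` and `threshold`, and the thresholding rule are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Discrete Gaussian kernel truncated at about 3 sigma, normalized to sum to 1."""
    radius = int(3 * sigma + 0.5)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()


def smooth_scores(scores, sigma=2.0):
    """Smooth per-frame scores along time; the output has the same length as the input."""
    scores = np.asarray(scores, dtype=float)
    k = gaussian_kernel(sigma)
    radius = len(k) // 2
    # Repeat the edge values so the first and last frames are not pulled towards zero.
    padded = np.pad(scores, radius, mode="edge")
    return np.convolve(padded, k, mode="valid")


def segment_events(scores, sigma=2.0, threshold=0.5):
    """Return (start, end) frame intervals, end exclusive, where the smoothed score
    exceeds the threshold. sigma and threshold are hypothetical values."""
    active = smooth_scores(scores, sigma) > threshold
    # Pad with False on both sides so every run has a detectable start and end.
    padded = np.concatenate(([False], active, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(changes[0::2], changes[1::2])]


# Synthetic per-frame scores: one true event (frames 10 to 19) with a missed
# frame inside it, plus an isolated false positive at frame 3.
scores = np.zeros(30)
scores[10:20] = 1.0
scores[15] = 0.0
scores[3] = 1.0
events = segment_events(scores)
```

In this sketch the smoothing suppresses the isolated spike and bridges the single missed frame, which is the practical sense in which temporal consistency is enforced: a decision at one frame is supported or overruled by its neighbours. A larger `sigma` bridges longer gaps at the cost of blurring event boundaries.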