With nearly one billion online videos viewed every day, an emerging frontier in computer vision research is recognition and search in video. While much effort has been devoted to collecting and annotating large, scalable static image datasets containing thousands of image categories, human action datasets lag far behind. Current action recognition databases contain on the order of ten different action categories collected under fairly controlled conditions. State-of-the-art performance on these datasets is now near ceiling, so there is a need to design and create new benchmarks. To address this issue we collected the largest action video database to date, with 51 action categories comprising around 7,000 manually annotated clips extracted from a variety of sources ranging from digitized movies to YouTube. We use this database to evaluate the performance of two representative computer vision systems for action recognition and explore the robustness of these methods under various conditions such as camera motion, viewpoint, video quality, and occlusion.