State-of-the-art human activity recognition methods build on discriminative learning, which requires a representative training set for good performance. This leads to scalability issues when recognizing large sets of highly diverse activities. In this paper we leverage the fact that many human activities are compositional and that the essential components of the activities can be obtained from textual descriptions or scripts. To share and transfer knowledge between composite activities, we model them by a common set of attributes corresponding to basic actions and object participants. This attribute representation allows us to incorporate script data that delivers new variations of a composite activity or even covers unseen composite activities. In our experiments on 41 composite cooking tasks, we found that script data successfully captures the high variability of composite activities. We show improvements in a supervised case where training data for all composite cooking tasks is available, but we are also able to recognize unseen composites using only script data, without any manual video annotation.
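As a rough illustration of how script data alone might be used to recognize an unseen composite, the sketch below assumes that attribute classifiers already provide one score per attribute for a video, and that each script has been reduced to a list of attribute names. The tf-idf style weighting, the function names `script_weights` and `recognize`, and the toy data are assumptions made for illustration only; they are not taken from the paper.

```python
import math
from collections import Counter


def script_weights(scripts_by_composite):
    """Hypothetical tf-idf style weight of each attribute for each composite.

    scripts_by_composite maps a composite name to a list of scripts,
    where each script is a list of attribute names (assumed input format).
    """
    n = len(scripts_by_composite)
    # Term frequency: how often each attribute occurs in a composite's scripts.
    tf = {
        composite: Counter(attr for script in scripts for attr in script)
        for composite, scripts in scripts_by_composite.items()
    }
    # Document frequency: number of composites whose scripts mention the attribute.
    df = Counter(attr for counts in tf.values() for attr in counts)
    weights = {}
    for composite, counts in tf.items():
        total = sum(counts.values())
        weights[composite] = {
            attr: (count / total) * math.log(n / df[attr])
            for attr, count in counts.items()
        }
    return weights


def recognize(attribute_scores, weights):
    """Return the composite whose script weights best match the video's attribute scores."""
    def score(composite):
        return sum(
            w * attribute_scores.get(attr, 0.0)
            for attr, w in weights[composite].items()
        )
    return max(weights, key=score)


# Toy script data (invented for this example).
scripts = {
    "prepare carrot": [
        ["wash", "carrot", "peel", "cut"],
        ["carrot", "peel", "knife", "cut"],
    ],
    "make coffee": [
        ["fill", "water", "coffee", "pour"],
        ["coffee", "cup", "pour"],
    ],
}

# Assumed attribute classifier outputs for one video.
video = {"peel": 0.9, "carrot": 0.8, "cut": 0.7, "pour": 0.1}

weights = script_weights(scripts)
predicted = recognize(video, weights)
print(predicted)
```

No video of either composite is needed to build `weights`; only the scripts are used, which mirrors the unseen-composite setting described above. An attribute mentioned by every composite receives zero weight under this scheme, so it does not influence the decision.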