This contribution addresses the generation of natural language descriptions of human actions, behaviour, and their relations with other objects observed in video streams. The work starts by applying conventional image processing techniques to extract high-level features from video. These features are then converted into natural language descriptions using a context-free grammar. Although the feature extraction processes are error-prone at various levels, we explore approaches to combining their outputs into a coherent description. The generated descriptions are evaluated by calculating ROUGE scores between human-annotated and machine-generated descriptions. In addition, we introduce a task-based evaluation by human subjects, which provides a qualitative assessment of the generated descriptions.
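
The two technical steps named above, realising extracted features as a sentence through a context-free grammar and scoring the result with ROUGE, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy grammar `GRAMMAR`, the feature names (`det`, `subj`, `verb`, `obj`), and the helper functions `realise`, `ngrams`, and `rouge_n_recall` are hypothetical and are not taken from the work itself. The scoring function implements only the basic ROUGE-N recall (clipped n-gram overlap divided by the number of reference n-grams), not the full ROUGE toolkit.

```python
from collections import Counter

# Hypothetical toy grammar: each nonterminal has a single expansion for brevity.
# A real system would have many alternative rules per nonterminal.
GRAMMAR = {
    "S": ["NP", "VP"],
    "NP": ["DET", "SUBJ"],
    "VP": ["VERB", "OBJ_NP"],
    "OBJ_NP": ["DET", "OBJ"],
}


def realise(symbol, features):
    """Expand `symbol` depth-first; terminals are looked up in `features`."""
    if symbol in GRAMMAR:
        words = []
        for child in GRAMMAR[symbol]:
            words.extend(realise(child, features))
        return words
    # Terminal symbol: fill in the word supplied by the (assumed) vision stage.
    return [features[symbol.lower()]]


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Basic ROUGE-N recall: clipped n-gram overlap over reference n-gram count."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Assumed output of the feature extraction stage (illustrative values only).
features = {"det": "a", "subj": "man", "verb": "holds", "obj": "cup"}
generated = " ".join(realise("S", features))

# Illustrative human annotation, not data from the study.
reference = "a man is holding a cup"
rouge1 = rouge_n_recall(generated, reference, n=1)
rouge2 = rouge_n_recall(generated, reference, n=2)
```

Because ROUGE recall counts only surface n-gram overlap, a description can lose points for a different but valid wording (here "holds" versus "is holding"), which is one reason a task-based human evaluation is a useful complement.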