We propose a method for describing human activities in video images, based on concept hierarchies of actions. A major difficulty in transforming video images into textual descriptions is bridging the semantic gap between them, a problem also known as the inverse Hollywood problem. In general, the concepts of human events or actions can be classified by semantic primitives. By associating these concepts with semantic features extracted from the video images, the method determines appropriate syntactic components, such as verbs and objects, and translates them into natural language sentences. We demonstrate the performance of the proposed method in several experiments.
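The association step can be illustrated with a minimal sketch. It is not the authors' implementation: the hierarchy, the feature names (`speed`, `approaching`, `target`, `holding`), the threshold, and the sentence template are all assumptions chosen for illustration. The sketch descends a small concept hierarchy to the most specific action whose condition holds for the extracted features, then fills a simple agent-verb-object frame.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Features = Dict[str, object]  # hypothetical semantic features from one video segment


@dataclass
class ActionConcept:
    verb: str                           # surface verb used in the sentence
    test: Callable[[Features], bool]    # condition on the extracted features
    object_key: Optional[str] = None    # feature that supplies the object, if any
    children: List["ActionConcept"] = field(default_factory=list)


def most_specific(concept: ActionConcept, features: Features) -> Optional[ActionConcept]:
    """Return the deepest concept in the hierarchy whose test holds."""
    if not concept.test(features):
        return None
    for child in concept.children:
        found = most_specific(child, features)
        if found is not None:
            return found
    return concept


# Illustrative hierarchy (an assumption): stay -> {move -> approach, hold}
HIERARCHY = ActionConcept(
    verb="stays",
    test=lambda f: True,
    children=[
        ActionConcept(
            verb="moves",
            test=lambda f: float(f.get("speed", 0.0)) > 0.1,
            children=[
                ActionConcept(
                    verb="approaches",
                    test=lambda f: bool(f.get("approaching", False))
                    and f.get("target") is not None,
                    object_key="target",
                ),
            ],
        ),
        ActionConcept(
            verb="holds",
            test=lambda f: f.get("holding") is not None,
            object_key="holding",
        ),
    ],
)


def describe(features: Features) -> str:
    """Fill an agent-verb-object frame and render it as a sentence."""
    concept = most_specific(HIERARCHY, features)  # the root always matches
    agent = features.get("agent", "person")
    parts = ["The", str(agent), concept.verb]
    if concept.object_key is not None:
        parts += ["the", str(features[concept.object_key])]
    return " ".join(parts) + "."


print(describe({"agent": "man", "speed": 0.8, "approaching": True, "target": "desk"}))
print(describe({"agent": "woman", "speed": 0.0, "holding": "book"}))
```

Choosing the deepest matching concept yields the most specific verb the features support, and falls back to a more general verb when the evidence is weaker.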