Natural Language Description of Image Sequences as a Form of Knowledge Representation

Authors:
Hans-Hellmut Nagel
Affiliations:
-
Venue:
KI '99 Proceedings of the 23rd Annual German Conference on Artificial Intelligence: Advances in Artificial Intelligence
Year:
1999

Citing (10)

Visual surveillance in a dynamic and uncertain world
Artificial Intelligence - Special volume on computer vision

Semantic networks for understanding scenes
Semantic networks for understanding scenes

Interpreting a dynamic and uncertain world: task-based control
Artificial Intelligence

Vehicles capable of dynamic vision: a new breed of technical beings?
Artificial Intelligence - Special issue: artificial intelligence 40 years later

Object identification: a Bayesian analysis with application to traffic surveillance
Artificial Intelligence - Special issue: artificial intelligence 40 years later

(Mis?-) Using DRT for Generation of Natural Language Text from Image Sequences
ECCV '98 Proceedings of the 5th European Conference on Computer Vision, Volume II

Qualitative Spatial Representation and Reasoning Techniques
KI '97 Proceedings of the 21st Annual German Conference on Artificial Intelligence: Advances in Artificial Intelligence

Bildbereichsbasierte Verfolgung von Straßenfahrzeugen durch adaptive Schätzung und Segmentierung von Optischen-Fluß-Feldern
Mustererkennung 1998, 20. DAGM-Symposium

Video Surveillance of Interactions
VS '99 Proceedings of the Second IEEE Workshop on Visual Surveillance

Building Qualitative Event Models Automatically from Visual Input
ICCV '98 Proceedings of the Sixth International Conference on Computer Vision

Cited by (2)

Representation of Behavioral Knowledge for Planning and Plan-Recognition in a Cognitive Vision System
KI '02 Proceedings of the 25th Annual German Conference on AI: Advances in Artificial Intelligence

High-Level Expectations for Low-Level Image Processing
KI '08 Proceedings of the 31st annual German conference on Advances in Artificial Intelligence

Abstract

An image sequence evaluation process combines information from several sources. One of these sources is a camera which records a scene and provides the acquired information as a digitized image sequence. A different source provides knowledge regarding signal processing and geometry, which is exploited to map the image sequence signal to a system-internal representation of visible bodies and their movement in the depicted scene. Still another type of source provides abstract conceptual knowledge linking the system-internal geometric representation to tasks and goals of agents which act within the depicted scene or may influence it from the outside. Rather than providing this third type of information for inference engines by 'handcrafted' rules or sets of axioms, it is postulated that this type of knowledge should be derived by algorithmic analysis of a suitably formulated natural language text: natural language text is considered a genuine representation of abstract knowledge for an image sequence evaluation process. This hypothesis is studied for the example of a system which transforms video sequences of road scenes into natural language text describing the traffic actually recorded.