Modal keywords, ontologies, and reasoning for video understanding

  • Authors:
  • Alejandro Jaimes; Belle L. Tseng; John R. Smith

  • Affiliations:
  • Pervasive Media Management, IBM T. J. Watson Research Center, Hawthorne, NY (all authors)

  • Venue:
  • CIVR'03: Proceedings of the 2nd International Conference on Image and Video Retrieval
  • Year:
  • 2003

Abstract

We propose a novel framework for video content understanding that uses rules constructed from knowledge bases and multimedia ontologies. Our framework consists of an expert system that uses a rule-based engine, domain knowledge, visual detectors (for objects and scenes), and metadata (text from automatic speech recognition, related text, etc.). We introduce the idea of modal keywords, which are keywords that represent perceptual concepts in the following categories: visual (e.g., sky), aural (e.g., scream), olfactory (e.g., vanilla), tactile (e.g., feather), and taste (e.g., candy). A method is presented to automatically classify keywords from speech recognition, queries, or related text into these categories using WordNet and TGM I. For video understanding, the following operations are performed automatically: scene cut detection, automatic speech recognition, feature extraction, and visual detection (e.g., sky, face, indoor). The results of these operations are used by a rule-based engine that applies context information (e.g., text from speech) to enhance the visual detection results. We discuss the semi-automatic construction of multimedia ontologies and present experiments in which visual detector outputs are modified by simple rules that use context information available with the video.
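As an illustration of the keyword-classification step, the sketch below labels a keyword with the perceptual modalities whose WordNet "anchor" concepts appear on one of its hypernym paths. This is not the authors' implementation (the paper uses WordNet together with TGM I); the anchor words, the use of first noun senses, and the path-intersection test are assumptions made for the example.

```python
# Minimal sketch of modal-keyword classification via WordNet hypernym paths.
# Requires: pip install nltk; then nltk.download('wordnet') once.
from nltk.corpus import wordnet as wn

def anchor(word: str):
    """First noun sense of an anchor word (a crude stand-in for a
    hand-chosen sense in a real system)."""
    return wn.synsets(word, pos=wn.NOUN)[0]

# Hypothetical anchor concepts per modality; the paper does not specify these.
ANCHORS = {
    "visual":    {anchor("color"), anchor("light"), anchor("shape")},
    "aural":     {anchor("sound"), anchor("noise")},
    "olfactory": {anchor("smell"), anchor("odor")},
    "tactile":   {anchor("texture"), anchor("touch")},
    "taste":     {anchor("taste"), anchor("flavor")},
}

def modal_categories(keyword: str) -> set[str]:
    """Return the modalities whose anchors lie on some hypernym path of
    any noun sense of `keyword`.  With these crude anchors, some words
    may return an empty set; a real system would tune the anchors."""
    categories = set()
    for synset in wn.synsets(keyword, pos=wn.NOUN):
        on_paths = {h for path in synset.hypernym_paths() for h in path}
        for modality, anchors in ANCHORS.items():
            if on_paths & anchors:
                categories.add(modality)
    return categories

if __name__ == "__main__":
    for word in ["sky", "scream", "vanilla", "candy"]:
        print(word, modal_categories(word))
```

Similarly, the rule-based enhancement of detector outputs with context can be pictured as keyword-triggered score adjustments applied per shot. The rule table, boost values, and data structure below are hypothetical placeholders for illustration, not rules taken from the paper.

```python
# Sketch (assumed, not from the paper) of a rule that adjusts visual detector
# confidences using context available with the video, here ASR keywords.
from dataclasses import dataclass

@dataclass
class ShotEvidence:
    detector_scores: dict[str, float]   # e.g. {"sky": 0.55, "face": 0.20}
    transcript_keywords: set[str]       # keywords extracted from ASR text

# Hypothetical rule table: transcript keyword -> (detector, additive boost).
RULES = {
    "sky":   ("sky", 0.15),
    "cloud": ("sky", 0.10),
    "crowd": ("face", 0.10),
}

def apply_rules(shot: ShotEvidence) -> dict[str, float]:
    """Return detector scores after applying context rules, clipped to [0, 1]."""
    scores = dict(shot.detector_scores)
    for keyword in shot.transcript_keywords:
        if keyword in RULES:
            detector, boost = RULES[keyword]
            scores[detector] = min(1.0, scores.get(detector, 0.0) + boost)
    return scores

if __name__ == "__main__":
    shot = ShotEvidence({"sky": 0.55, "face": 0.20}, {"cloud", "music"})
    print(apply_rules(shot))  # "sky" score boosted by the "cloud" rule
```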