Semantic video content annotation at the object level

  • Authors:
  • Vanessa El-Khoury;Martin Jergler;David Coquil;Harald Kosch

  • Affiliations:
  • University of Passau, Passau, Germany;University of Passau, Passau, Germany;University of Passau, Passau, Germany;University of Passau, Passau, Germany

  • Venue:
  • Proceedings of the 10th International Conference on Advances in Mobile Computing & Multimedia
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

A vital prerequisite for fine-grained video content processing (indexing, querying, retrieval, adaptation, etc.) is the production of accurate metadata describing its structure and semantics. Several annotation tools were presented in the literature generating metadata at different granularities (i.e. scenes, shots, frames, objects). These tools have a number of limitations with respect to the annotation of objects. Though they provide functionalities to localize and annotate an object in a frame, the propagation of this information in the next frames still requires human intervention. Furthermore, they are based on video models that lack expressiveness along the spatial and semantic dimensions. To address these shortcomings, we propose the Semantic Video Content Annotation Tool (SVCAT) for structural and high-level semantic annotation. SVCAT is a semi-automatic annotation tool compliant with the MPEG-7 standard, which produces metadata according to an object-based video content model described in this paper. In particular, the novelty of SVCAT lies in its automatic propagation of the object localization and description metadata realized by tracking their contour through the video, thus drastically alleviating the task of the annotator. Experimental results show that SVCAT provides accurate metadata to object-based applications, particularly exact contours of multiple deformable objects.