Audio-Visual Content Analysis for Content-Based Video Indexing

  • Authors: Sofia Tsekeridou, Ioannis Pitas
  • Affiliation: Aristotle University of Thessaloniki
  • Venue: ICMCS '99, Proceedings of the IEEE International Conference on Multimedia Computing and Systems, Volume 2
  • Year: 1999

Abstract

An audio-visual content analysis method is presented that analyzes both the auditory and the visual information sources and exploits their interrelations and temporal coincidence to extract high-level semantic information. Both shot-based and object-based access to the visual information is employed. Because video is inherently temporal, time must be accounted for; time-constrained video labelling functions are therefore generated. Parsing the audio source yields a speaker identity mapping function over time, while parsing the visual source yields a talking-face shot mapping function over time. Integrating the audio and visual mappings under interaction rules produces more detailed video content descriptions and even a partial detection of the video's context.
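The integration step described in the abstract can be sketched as an intersection of two time-indexed mappings: one labelling audio intervals with speaker identities, the other marking talking-face shots. The sketch below is a minimal illustration under assumed representations (intervals as (start, end) tuples); the function name and the interval encoding are not from the paper.

```python
def intersect_mappings(speaker_segments, face_segments):
    """Return labelled segments where a known speaker's voice coincides
    with a detected talking-face shot.

    speaker_segments: list of (start, end, speaker_id) tuples
    face_segments:    list of (start, end) talking-face shot intervals
    (Both representations are assumptions for this sketch, not the
    paper's actual notation.)
    """
    results = []
    for s_start, s_end, speaker in speaker_segments:
        for f_start, f_end in face_segments:
            # Temporal intersection of the two intervals.
            start, end = max(s_start, f_start), min(s_end, f_end)
            if start < end:  # keep only non-empty overlaps
                results.append((start, end, speaker))
    return sorted(results)

# Example: speaker "A" is heard in [0, 5) and [8, 12);
# a talking-face shot spans [3, 9).
segments = intersect_mappings(
    [(0.0, 5.0, "A"), (8.0, 12.0, "A")],
    [(3.0, 9.0)],
)
# segments == [(3.0, 5.0, "A"), (8.0, 9.0, "A")]
```

Rules of this kind (e.g. "a speaker is visible when her identified voice overlaps a talking-face shot") are what allow the combined mapping to carry more semantics than either modality alone.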