Audio-Visual Content Analysis for Content-Based Video Indexing

Authors:
Sofia Tsekeridou;Ioannis Pitas
Affiliations:
Aristotle University of Thessaloniki;Aristotle University of Thessaloniki
Venue:
ICMCS '99 Proceedings of the IEEE International Conference on Multimedia Computing and Systems - Volume 2
Year:
1999

Citing 0
Cited 3

Scene Change Detection Based on Audio-Visual Analysis and Interaction
Proceedings of the 10th International Workshop on Theoretical Foundations of Computer Vision: Multi-Image Analysis

Clustering of Imperfect Transcripts Using a Novel Similarity Measure
Information Retrieval Techniques for Speech Applications (book based on the workshop of the same name, held as part of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New Orleans, USA, September 2001)

Multimodal analysis of recorded video for e-learning
Proceedings of the 13th annual ACM international conference on Multimedia



Abstract

An audio-visual content analysis method is presented that analyzes both the auditory and the visual information sources and accounts for their inter-relations and coincidence in order to extract high-level semantic information. Both shot-based and object-based access to the visual information is employed. Because video is inherently temporal, the analysis produces time-constrained video labelling functions. Parsing the audio source yields a speaker identity mapping function over time. Parsing the visual source yields a talking face shot mapping function over time. Integrating the audio and visual mappings under a set of interaction rules leads to more detailed video content descriptions and even to partial detection of the video's context.
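As a rough illustration of the integration step described in the abstract, the sketch below treats each mapping as a piecewise-constant labelling function over time and combines the two on the union of their segment boundaries. Everything in it is an assumption made for illustration: the `Segment` representation, the names `label_at` and `integrate`, and in particular the three interaction rules, since the abstract does not state the rules actually used in the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (start_sec, end_sec, label).
# A label of None means "no speaker" (audio) or "no talking face" (visual).
Segment = Tuple[float, float, Optional[str]]


def label_at(segments: List[Segment], t: float) -> Optional[str]:
    """Evaluate a piecewise-constant labelling function at time t."""
    for start, end, label in segments:
        if start <= t < end:
            return label
    return None


def integrate(audio: List[Segment], visual: List[Segment]) -> List[Segment]:
    """Combine an audio and a visual labelling function into one description."""
    # Use every segment boundary so both functions are constant per interval.
    cuts = sorted({t for s, e, _ in audio + visual for t in (s, e)})
    out: List[Segment] = []
    for start, end in zip(cuts, cuts[1:]):
        mid = (start + end) / 2
        speaker = label_at(audio, mid)
        face = label_at(visual, mid)
        # Illustrative interaction rules only; not taken from the paper.
        if speaker and face:
            desc = f"{speaker} speaking on screen"
        elif speaker:
            desc = f"{speaker} speaking off screen"
        elif face:
            desc = "talking face without identified speaker"
        else:
            desc = None
        # Merge with the previous interval when the description is unchanged.
        if out and out[-1][2] == desc and out[-1][1] == start:
            out[-1] = (out[-1][0], end, desc)
        else:
            out.append((start, end, desc))
    return out


if __name__ == "__main__":
    audio_map = [(0.0, 4.0, "speaker_A"), (4.0, 6.0, None)]
    visual_map = [(0.0, 2.0, "talking_face"), (2.0, 6.0, None)]
    for segment in integrate(audio_map, visual_map):
        print(segment)
```

Sampling each interval at its midpoint keeps the sketch short; a real system would also have to handle uncertain boundaries and conflicting detections, which this example ignores.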