Video story segmentation with multi-modal features: experiments on TRECvid 2003

Authors:
Laurent Besacier;Georges Quénot;Stéphane Ayache;Daniel Moraru
Affiliations:
CLIPS / IMAG, Grenoble cedex;CLIPS / IMAG, Grenoble cedex;CLIPS / IMAG, Grenoble cedex;CLIPS / IMAG, Grenoble cedex
Venue:
Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval
Year:
2004

Abstract

This paper describes the first steps of CLIPS/IMAG on the TREC video story segmentation task. We mainly describe the multi-modal features used and their respective performance on this task. These features are based on the audio, video and text modalities. We also present the preliminary system, which has the advantage of depending relatively little on training data. First experiments on the TRECVID 2003 evaluation set yield a recall of 0.613 and a precision of 0.467. We plan to participate in the official TRECVID 2004 story segmentation task with this system.
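
To make the recall and precision figures concrete, the following is a minimal sketch of how boundary-detection scores of this kind can be computed. It is not the authors' scoring code or the official TRECVID evaluation tool. The function names, the example boundary times, and the 5-second matching tolerance are assumptions chosen for illustration: a reference boundary counts as detected if some hypothesized boundary lies within the tolerance, and a hypothesized boundary counts as correct if some reference boundary lies within the tolerance.

```python
from bisect import bisect_left


def _has_neighbor(sorted_times, t, tol):
    """Return True if some value in sorted_times lies within tol of t."""
    i = bisect_left(sorted_times, t - tol)
    return i < len(sorted_times) and sorted_times[i] <= t + tol


def boundary_scores(reference, hypothesis, tol=5.0):
    """Recall and precision for story boundaries given as times in seconds.

    tol is an assumed matching tolerance, not a value taken from the abstract.
    """
    ref = sorted(reference)
    hyp = sorted(hypothesis)
    # Reference boundaries that have at least one nearby hypothesis.
    detected = sum(_has_neighbor(hyp, r, tol) for r in ref)
    # Hypothesized boundaries that have at least one nearby reference.
    correct = sum(_has_neighbor(ref, h, tol) for h in hyp)
    recall = detected / len(ref) if ref else 0.0
    precision = correct / len(hyp) if hyp else 0.0
    return recall, precision


def f_measure(recall, precision):
    """Harmonic mean of recall and precision (0.0 if both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Hypothetical boundary times, for illustration only.
recall, precision = boundary_scores([10, 60, 120], [12, 58, 90, 200])
```

In this toy example, two of the three reference boundaries are detected and two of the four hypothesized boundaries are correct. The abstract does not report an F-measure; applying `f_measure(0.613, 0.467)` to the reported figures gives roughly 0.53, which is derived arithmetic only and not a result stated by the authors.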