Overview of VideoCLEF 2008: automatic generation of topic-based feeds for dual language audio-visual content

Authors:
Martha Larson; Eamonn Newman; Gareth J. F. Jones
Affiliations:
EEMCS, Delft University of Technology, Delft, Netherlands; Centre for Digital Video Processing, Dublin City University, Dublin 9, Ireland; Centre for Digital Video Processing, Dublin City University, Dublin 9, Ireland
Venue:
CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access
Year:
2008

Abstract

The VideoCLEF track, introduced in 2008, aims to develop and evaluate tasks related to the analysis of and access to multilingual multimedia content. In its first year, VideoCLEF piloted the Vid2RSS task, whose main subtask was the classification of dual-language video (Dutch-language television content featuring English-speaking experts and studio guests). The task offered two additional optional subtasks: feed translation and automatic keyframe extraction. Task participants were supplied with Dutch archival metadata, Dutch speech transcripts, English speech transcripts and ten thematic category labels, which they were required to assign to the test set videos. The videos were grouped by class label into topic-based RSS feeds displaying the title, description and keyframe of each video. Five groups participated in the 2008 VideoCLEF track. Participants were required to collect their own training data; both Wikipedia and general web content were used. Groups deployed various classifiers (SVM, Naive Bayes and k-NN) or treated the problem as an information retrieval task. Both the Dutch speech transcripts and the archival metadata performed well as sources of indexing features, but no group succeeded in exploiting combinations of feature sources to significantly enhance performance. A small-scale fluency/adequacy evaluation of the translation task output showed the translation to be of sufficient quality to be valuable to English speakers who do not speak Dutch. For keyframe extraction, the strategy chosen was to select the keyframe from the shot with the most representative speech transcript content. A small user study showed the automatically selected shots to be competitive with manually selected shots. Future years of VideoCLEF will aim to expand the corpus and the class label list, as well as to extend the track to additional tasks.
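
The keyframe strategy described in the abstract, choosing the shot whose speech transcript is most representative of the video, could be sketched as follows. This is an illustration under stated assumptions, not the method used in the track: "most representative" is interpreted here as the highest cosine similarity between a shot's term-count vector and the term-count vector of the whole video, and the names `tokenize`, `cosine` and `most_representative_shot` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (an assumed, very simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_representative_shot(shot_transcripts):
    """Return the index of the shot whose transcript is closest to the whole-video transcript.

    Assumption: representativeness is measured by cosine similarity of raw term counts.
    """
    if not shot_transcripts:
        raise ValueError("at least one shot transcript is required")
    shot_vectors = [Counter(tokenize(t)) for t in shot_transcripts]
    # The video vector is the sum of all shot vectors.
    video_vector = Counter()
    for vector in shot_vectors:
        video_vector.update(vector)
    scores = [cosine(vector, video_vector) for vector in shot_vectors]
    # Ties are resolved in favour of the earliest shot.
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    shots = ["hello", "art museum art", "art museum painting art museum"]
    print(most_representative_shot(shots))  # prints 2
```

A keyframe would then be taken from the returned shot. In practice, stop-word removal or TF-IDF weighting would likely be needed so that frequent function words do not dominate the similarity scores; both are omitted here for brevity.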