Large Scale Concept Detection in Video Using a Region Thesaurus

Authors:
Evaggelos Spyrou;Giorgos Tolias;Yannis Avrithis
Affiliations:
Image, Video and Multimedia Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece 157 80;Image, Video and Multimedia Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece 157 80;Image, Video and Multimedia Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece 157 80
Venue:
MMM '09 Proceedings of the 15th International Multimedia Modeling Conference on Advances in Multimedia Modeling
Year:
2009

Abstract

This paper presents an approach to high-level feature detection in video documents using a region thesaurus. Each video shot is represented by a single keyframe, and MPEG-7 features are extracted locally from coarsely segmented regions. A clustering algorithm is then applied to the extracted regions, and a region thesaurus is constructed to describe each keyframe at a level higher than the low-level descriptors but lower than the high-level concepts. A model vector representation is formed, and several high-level concept detectors are trained using a global keyframe annotation. The proposed approach is thoroughly evaluated on the TRECVID 2007 development data for the detection of nine high-level concepts, demonstrating sufficient performance on large data sets.
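
The pipeline described in the abstract (cluster region descriptors into a thesaurus, then describe each keyframe by a model vector) can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract does not give these details: plain k-means stands in for the unspecified clustering algorithm, Euclidean distance stands in for the MPEG-7 descriptor distances, and the model vector is taken to hold, for each region type, the smallest distance from any region of the keyframe to that type. The names `build_thesaurus` and `model_vector` are hypothetical.

```python
import numpy as np


def build_thesaurus(region_features, k, iters=20, seed=0):
    """Cluster region descriptors with plain k-means (an assumed stand-in).

    Returns a (k, d) array of centroids, used here as the "region types"
    of the thesaurus.
    """
    feats = np.asarray(region_features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct regions.
    centroids = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every region to every centroid, shape (n, k).
        dists = np.linalg.norm(feats[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = feats[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids


def model_vector(keyframe_regions, centroids):
    """Describe one keyframe by its distance to each region type.

    Assumption: entry j is the minimum distance between any region of the
    keyframe and region type j, so small values mean "this type is present".
    """
    regions = np.asarray(keyframe_regions, dtype=float)
    dists = np.linalg.norm(regions[:, None, :] - centroids[None, :, :], axis=2)
    return dists.min(axis=0)


# Toy usage: four 2-D "region descriptors" forming two obvious groups.
all_regions = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
thesaurus = build_thesaurus(all_regions, k=2)
vector = model_vector([[0.0, 0.5]], thesaurus)
```

In this sketch, the model vectors of annotated keyframes would serve as the inputs on which one detector per high-level concept is trained; the choice of classifier is left open because the abstract does not name one.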