High-Level Concept Detection in Video Using a Region Thesaurus

Authors:
Evaggelos Spyrou;Yannis Avrithis
Affiliations:
Image, Video and Multimedia Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens;Image, Video and Multimedia Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens
Venue:
Proceedings of the 2007 conference on Emerging Artificial Intelligence Applications in Computer Engineering: Real Word AI Systems with Applications in eHealth, HCI, Information Retrieval and Pervasive Technologies
Year:
2007

Abstract

This work presents an approach to high-level semantic feature detection in video sequences. Keyframes are selected to represent the visual content of the shots. Low-level feature extraction is then performed on the keyframes, and a feature vector comprising color and texture features is formed. A region thesaurus that contains all the high-level features is constructed using a subtractive clustering method, where each feature is obtained as the centroid of a cluster. Then, a model vector that contains the distances from each region type is formed, and an SVM detector is trained for each semantic concept. The approach is also extended with Latent Semantic Analysis as a further step to exploit co-occurrences of the region types. The high-level concepts detected are desert, vegetation, mountain, road, sky and snow within TV news bulletins. Experiments were performed on TRECVID 2005 development data.
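
The thesaurus and model-vector steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, so the function names, the `radius` and `stop_ratio` values, the Euclidean metric, and the choice to keep the minimum distance per region type are all assumptions made for illustration. The clustering follows a simplified form of the commonly cited subtractive clustering procedure, in which each centroid is one of the data points. The SVM and Latent Semantic Analysis stages are omitted.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def subtractive_clustering(points, radius=0.5, stop_ratio=0.15):
    """Simplified subtractive clustering; returns the selected centroids.

    radius and stop_ratio are illustrative defaults, not values from the paper.
    """
    if not points:
        return []
    alpha = 4.0 / radius ** 2            # neighbourhood width for the potential
    beta = 4.0 / (1.5 * radius) ** 2     # wider neighbourhood for the subtraction
    # Potential of a point: how densely it is surrounded by other points.
    potential = [
        sum(math.exp(-alpha * sq_dist(p, q)) for q in points) for p in points
    ]
    first_peak = max(potential)
    centers = []
    while True:
        k = max(range(len(points)), key=potential.__getitem__)
        peak = potential[k]
        if peak < stop_ratio * first_peak:
            break
        centers.append(points[k])
        # Suppress the potential near the new centroid so that the next
        # centroid is picked from a different dense area.
        potential = [
            pot - peak * math.exp(-beta * sq_dist(p, points[k]))
            for p, pot in zip(points, potential)
        ]
    return centers


def model_vector(region_features, thesaurus):
    """One entry per region type: distance to the closest region of the keyframe.

    Keeping the minimum over the keyframe's regions is an assumption.
    """
    return [
        min(math.sqrt(sq_dist(r, t)) for r in region_features)
        for t in thesaurus
    ]
```

With a thesaurus built from the region features of all training keyframes, `model_vector` would be evaluated once per keyframe. The resulting fixed-length vectors would then be the input to the per-concept SVM detectors mentioned in the abstract.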