Retrieval of multimedia objects by combining semantic information from visual and textual descriptors

Authors:
Mats Sjöberg;Jorma Laaksonen;Matti Pöllä;Timo Honkela
Affiliations:
Laboratory of Computer and Information Science, Helsinki University of Technology, HUT, Finland;Laboratory of Computer and Information Science, Helsinki University of Technology, HUT, Finland;Laboratory of Computer and Information Science, Helsinki University of Technology, HUT, Finland;Laboratory of Computer and Information Science, Helsinki University of Technology, HUT, Finland
Venue:
ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part II
Year:
2006

Abstract

We propose a method for content-based retrieval of multimedia objects that have visual, aural and textual properties. In our method, training examples of objects belonging to a specific semantic class are associated with their low-level visual descriptors (such as MPEG-7 descriptors) and with textual features such as the frequencies of significant keywords. A fuzzy mapping from a semantic class in the training set to a class of similar objects in the test set is created by using Self-Organizing Maps (SOMs) trained on automatically extracted low-level descriptors. We have performed several experiments with different textual features to evaluate the potential of our approach in bridging the gap from visual features to semantic concepts by the use of textual presentations. Our initial results show a promising increase in retrieval performance.
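
The following is a minimal sketch of one way the idea in the abstract could be realized. It is not the authors' implementation. It assumes one small SOM per feature type, marks the best-matching units of training examples inside and outside the semantic class with positive and negative weights, smooths those marks over the map with a Gaussian kernel to obtain a soft ("fuzzy") class map, and scores each test object by summing the map values at its best-matching units. All function names, map sizes, learning parameters and the synthetic "visual" and "text" features are assumptions made for illustration only.

```python
import numpy as np

def train_som(data, rows=6, cols=6, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular SOM online; return (unit weights, unit grid coordinates)."""
    rng = np.random.default_rng(seed)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize units from randomly chosen data vectors (an assumption of this sketch).
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                   # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5       # shrinking neighbourhood radius
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))      # neighbourhood function on the grid
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights, grid

def bmu_indices(weights, data):
    """Index of the best-matching unit for every row of data."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def class_relevance_map(weights, grid, positives, negatives, sigma=1.0):
    """Soft class map: normalized +/- hits on the BMUs, smoothed over the map surface."""
    hits = np.zeros(len(weights))
    np.add.at(hits, bmu_indices(weights, positives), 1.0 / len(positives))
    np.add.at(hits, bmu_indices(weights, negatives), -1.0 / len(negatives))
    d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2)) @ hits

def score_objects(feature_maps, test_features):
    """Sum the class-map value at each test object's BMU over all feature types."""
    total = np.zeros(len(next(iter(test_features.values()))))
    for name, (weights, relevance) in feature_maps.items():
        total += relevance[bmu_indices(weights, test_features[name])]
    return total

def demo(seed=0):
    """Synthetic stand-ins for visual and textual features (illustrative data only)."""
    rng = np.random.default_rng(seed)
    make = lambda n, dim, shift: rng.normal(0.0, 1.0, (n, dim)) + shift
    train = {"visual": (make(60, 8, 3.0), make(60, 8, 0.0)),
             "text": (make(60, 5, 3.0), make(60, 5, 0.0))}
    # The first 20 test objects belong to the class, the last 20 do not.
    test = {"visual": np.vstack([make(20, 8, 3.0), make(20, 8, 0.0)]),
            "text": np.vstack([make(20, 5, 3.0), make(20, 5, 0.0)])}
    maps = {}
    for name, (pos, neg) in train.items():
        weights, grid = train_som(np.vstack([pos, neg]))
        maps[name] = (weights, class_relevance_map(weights, grid, pos, neg))
    return score_objects(maps, test)
```

Calling `demo()` returns one score per test object; ranking the objects by descending score gives a retrieval order. Summing the per-feature scores is the simplest way to let textual features complement visual ones in this sketch, and other combination rules would fit equally well.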