Multimedia strategies for B3-SDR, based on principal component analysis

  • Authors:
  • Roelof van Zwol

  • Affiliations:
  • Department of Computer Science, Center for Content and Knowledge Engineering, Utrecht University, Utrecht, The Netherlands

  • Venue:
  • INEX'05 Proceedings of the 4th international conference on Initiative for the Evaluation of XML Retrieval
  • Year:
  • 2005

Abstract

This article presents and evaluates an XML-driven approach to multimedia information retrieval that uses principal component analysis (PCA) to derive a composite ranking for a set of XML elements that have a multimedia character. The multimedia strategies, which implement the PCA module on top of the B3-SDR system, allow image retrieval to be integrated with the text retrieval modules already present. Three strategies are defined. The first implements annotation-based image retrieval, which uses the caption of an image to find related images through a keyword-based search. The second enables content-based multimedia retrieval by using PCA to derive a composite ranking that reflects the combined relevance of the text and images present within an XML element. A simple content-based image retrieval system, which uses ‘query by example’, was built for this purpose. The third allows for a bidirectional combination of the first two strategies, in which the content-based image retrieval component benefits from the additional images retrieved by the annotation-based search, and vice versa. The multimedia strategies are evaluated within the INEX 2005 multimedia track, where retrieval performance is measured in terms of recall and precision, based on the Lonelyplanet Worldguide and a set of related topics. The outcome of the experiment shows that the multimedia strategies have a positive influence on retrieval performance when compared to the text-based XML retrieval system. However, the PCA component did not yet fully live up to expectations, which is probably due to the poor performance of the ad hoc image retrieval system built for the experiment.
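The abstract does not spell out how PCA turns per-modality scores into one ranking. The following is a minimal sketch of one common reading, not the B3-SDR implementation: each XML element gets a text score and an image score, the two columns are standardized, and the elements are ranked by their projection onto the first principal component. The function name `pca_composite_ranking`, its inputs, and the sign-orientation heuristic are assumptions made for illustration.

```python
import numpy as np


def pca_composite_ranking(text_scores, image_scores):
    """Hypothetical helper: rank elements by their first-principal-component score.

    text_scores and image_scores are equal-length sequences holding one
    relevance score per XML element (assumed inputs, not B3-SDR data structures).
    Returns (composite, order), where order lists element indices best-first.
    """
    X = np.column_stack([text_scores, image_scores]).astype(float)

    # Standardize each modality so that neither dominates purely by scale.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against a constant column
    Z = (X - X.mean(axis=0)) / std

    # The first principal component is the eigenvector of the covariance
    # matrix with the largest eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    pc1 = eigvecs[:, np.argmax(eigvals)]

    # Eigenvectors are only defined up to sign. Assumption: orient the
    # component so that higher modality scores give a higher composite.
    # This heuristic is only meaningful when the modalities broadly agree.
    if pc1.sum() < 0:
        pc1 = -pc1

    composite = Z @ pc1
    order = np.argsort(-composite, kind="stable")
    return composite, order
```

Calling `pca_composite_ranking([0.9, 0.2, 0.5, 0.7], [0.8, 0.1, 0.4, 0.9])` returns a composite score per element together with the indices sorted best-first. When the text and image scores are negatively correlated, the first component contrasts the modalities rather than combining them, so a real system would need a more careful orientation rule than the one assumed here.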