Semantic combination of textual and visual information in multimedia retrieval

  • Authors:
  • Stéphane Clinchant;Julien Ah-Pine;Gabriela Csurka

  • Affiliations:
  • Xerox Research Centre Europe, chemin de Maupertuis, Meylan, France;University of Lyon, avenue Pierre Mendès France, Bron, France;Xerox Research Centre Europe, chemin de Maupertuis, Meylan, France

  • Venue:
  • Proceedings of the 1st ACM International Conference on Multimedia Retrieval
  • Year:
  • 2011

Abstract

The goal of this paper is to introduce a set of techniques, which we call semantic combination, for efficiently fusing text and image retrieval systems in the context of multimedia information access. These techniques stem from the observation that image and textual queries are expressed at different semantic levels and that a single image query is often ambiguous. Overall, the semantic combination techniques overcome a conceptual barrier rather than a technical one: the methods can be seen as a combination of late fusion and image reranking. Albeit simple, this approach has not been used before. We assess the proposed techniques against late fusion and cross-media fusion on four different ImageCLEF datasets. Compared to late fusion, performance increases significantly on two datasets and remains similar on the other two.
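
To make the idea of "late fusion combined with image reranking" concrete, here is a minimal Python sketch of one plausible reading of it. It is not the authors' implementation. The function names, the min-max normalization, the weight `alpha`, the cutoff `k`, and the choice to give a zero visual score to documents outside the text top-k are all assumptions made for illustration. The sketch keeps visual scores only for the documents that the text system ranks highest, then fuses text and visual scores linearly.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} dict to the [0, 1] range (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def late_fusion(text_scores, image_scores, alpha=0.5):
    """Plain late fusion: weighted sum of normalized text and image scores."""
    t = min_max_normalize(text_scores)
    v = min_max_normalize(image_scores)
    docs = set(t) | set(v)
    # A document missing from one modality contributes 0 for that modality.
    return {d: alpha * t.get(d, 0.0) + (1 - alpha) * v.get(d, 0.0) for d in docs}


def semantic_combination(text_scores, image_scores, k=2, alpha=0.5):
    """Hypothetical sketch: filter image scores by the text top-k, then fuse.

    Restricting visual scores to documents the text system already ranks
    highly is the "image reranking" part; the weighted sum is the
    "late fusion" part.
    """
    top_k = sorted(text_scores, key=text_scores.get, reverse=True)[:k]
    filtered_image = {d: image_scores[d] for d in top_k if d in image_scores}
    return late_fusion(text_scores, filtered_image, alpha)


def rank(scores):
    """Return doc ids sorted by decreasing fused score."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy scores (made up for illustration only).
text = {"a": 4.0, "b": 3.0, "c": 1.0}
image = {"a": 0.1, "b": 0.5, "c": 0.9}

fused_late = late_fusion(text, image)
fused_semantic = semantic_combination(text, image, k=2)
```

In this toy example, document `c` is visually close to the image query but ranks low on text. Plain late fusion still gives it credit for its visual score, whereas the filtered variant ignores that visual evidence because `c` falls outside the text top-k.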