Fusion vs. two-stage for multimodal retrieval

  • Authors:
  • Avi Arampatzis;Konstantinos Zagoris;Savvas A. Chatzichristofis

  • Affiliations:
  • Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece;Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece;Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece

  • Venue:
  • ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
  • Year:
  • 2011

Abstract

We compare two methods for retrieval from multimodal collections. The first is a score-based fusion of results retrieved visually and textually. The second is a two-stage method that visually re-ranks the top-K results retrieved textually. We discuss their underlying hypotheses and practical limitations, and conduct a comparative evaluation on a standardized snapshot of Wikipedia. Both methods are found to be significantly more effective than single-modality baselines, with no clear winner but with different robustness features. Nevertheless, two-stage retrieval provides efficiency benefits over fusion.
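
The two strategies named in the abstract can be contrasted with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the score normalization, the fusion weights, or how K is chosen, so min-max normalization, a linear combination with a hypothetical weight `w_text`, and a fixed `k` are used here purely as placeholders. The function names are likewise hypothetical.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]. Assumed normalization, not from the paper."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(text_scores: Dict[str, float],
         visual_scores: Dict[str, float],
         w_text: float = 0.5) -> List[str]:
    """Score-based fusion: combine both modalities over the union of results.

    The linear combination and the weight w_text are assumptions.
    """
    t, v = min_max(text_scores), min_max(visual_scores)
    docs = set(t) | set(v)
    combined = {
        doc: w_text * t.get(doc, 0.0) + (1.0 - w_text) * v.get(doc, 0.0)
        for doc in docs
    }
    # Sort by descending combined score; break ties by document id.
    return sorted(docs, key=lambda doc: (-combined[doc], doc))


def two_stage(text_scores: Dict[str, float],
              visual_scores: Dict[str, float],
              k: int) -> List[str]:
    """Two-stage retrieval: rank by text, then visually re-rank the top-k.

    Documents below rank k keep their text-based order. Only the top-k
    documents need a visual score at all.
    """
    by_text = sorted(text_scores, key=lambda doc: (-text_scores[doc], doc))
    top, rest = by_text[:k], by_text[k:]
    reranked = sorted(top, key=lambda doc: (-visual_scores.get(doc, 0.0), doc))
    return reranked + rest


if __name__ == "__main__":
    text = {"a": 3.0, "b": 2.0, "c": 1.0}
    visual = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 1.0}
    print(fuse(text, visual))
    print(two_stage(text, visual, k=2))
```

In this sketch, `fuse` needs a visual score for every candidate document, whereas `two_stage` looks up visual scores only for the `k` documents that survive the textual stage, which is one plausible reading of the efficiency benefit the abstract mentions. Note also that `two_stage` can never return a document the textual stage missed (such as `"d"` in the toy data), while `fuse` can.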