Bag-of-visual-words vs global image descriptors on two-stage multimodal retrieval

  • Authors:
  • Konstantinos Zagoris; Savvas A. Chatzichristofis; Avi Arampatzis

  • Affiliations:
  • Democritus University of Thrace, Xanthi, Greece; Democritus University of Thrace, Xanthi, Greece; Democritus University of Thrace, Xanthi, Greece

  • Venue:
  • Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
  • Year:
  • 2011

Abstract

The Bag-Of-Visual-Words (BOVW) paradigm is fast becoming a popular image representation for Content-Based Image Retrieval (CBIR), mainly because it retrieves more effectively than global feature representations on collections that contain near-duplicates of the query images. In this experimental study we demonstrate that this advantage of BOVW diminishes when visual diversity is enhanced by using a secondary modality, such as text, to pre-filter images. The TOP-SURF descriptor is evaluated against Compact Composite Descriptors in a two-stage image retrieval setup, which first uses the text modality to rank the collection and then performs CBIR only on the top-K items.
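
The two-stage setup described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy term-overlap text score, the Euclidean distance standing in for whatever similarity measure each descriptor actually uses, and the tiny hand-made collection. It shows only the control flow: rank the whole collection by text, keep the top K, then re-rank those K items by visual distance to the query image.

```python
from math import sqrt


def text_score(query_terms, doc_terms):
    """Toy text relevance (assumption): count of query terms found in the document."""
    return len(set(query_terms) & set(doc_terms))


def visual_distance(a, b):
    """Euclidean distance between two descriptor vectors.

    A stand-in metric (assumption); real descriptors may call for another measure.
    """
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def two_stage_retrieve(query_terms, query_descriptor, collection, k):
    """Return item ids after text pre-filtering and visual re-ranking.

    collection: list of dicts with keys 'id', 'terms', 'descriptor' (hypothetical layout).
    """
    # Stage 1: rank the whole collection by text score and keep the top K.
    by_text = sorted(
        collection,
        key=lambda d: text_score(query_terms, d["terms"]),
        reverse=True,
    )
    top_k = by_text[:k]

    # Stage 2: run content-based ranking only on the K pre-filtered items.
    reranked = sorted(
        top_k,
        key=lambda d: visual_distance(query_descriptor, d["descriptor"]),
    )
    return [d["id"] for d in reranked]


# Hand-made example data (hypothetical).
collection = [
    {"id": "a", "terms": ["red", "car"], "descriptor": [0.9, 0.1]},
    {"id": "b", "terms": ["red", "car", "street"], "descriptor": [0.2, 0.8]},
    {"id": "c", "terms": ["blue", "sky"], "descriptor": [0.21, 0.79]},
    {"id": "d", "terms": ["car"], "descriptor": [0.5, 0.5]},
]

result = two_stage_retrieve(["red", "car"], [0.2, 0.8], collection, k=3)
print(result)
```

In this toy data, item "c" is visually very close to the query but shares no query terms, so the text stage drops it before the visual stage runs. That is the pre-filtering effect the abstract refers to: the visual descriptor only has to order items that the text modality has already judged relevant.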