Bag-of-visual-words vs global image descriptors on two-stage multimodal retrieval

Authors:
Konstantinos Zagoris;Savvas A. Chatzichristofis;Avi Arampatzis
Affiliations:
Democritus University of Thrace, Xanthi, Greece;Democritus University of Thrace, Xanthi, Greece;Democritus University of Thrace, Xanthi, Greece
Venue:
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Year:
2011

Citing 4
Cited 0

Where to stop reading a ranked list?: threshold optimization using truncated score distributions
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval

TOP-SURF: a visual words toolkit
Proceedings of the international conference on Multimedia

Dynamic two-stage image retrieval from large multimodal databases
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval

Fusion vs. two-stage for multimodal retrieval
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval

Abstract

The Bag-Of-Visual-Words (BOVW) paradigm is fast becoming a popular image representation for Content-Based Image Retrieval (CBIR), mainly because it retrieves more effectively than global feature representations on collections that contain near-duplicates of the query images. In this experimental study we demonstrate that this advantage of BOVW diminishes when visual diversity is increased by using a secondary modality, such as text, to pre-filter images. The TOP-SURF descriptor is evaluated against Compact Composite Descriptors in a two-stage image retrieval setup, which first uses a text modality to rank the collection and then performs CBIR only on the top-K items.
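
The two-stage setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `two_stage_retrieve`, the precomputed text scores, the toy feature vectors, and the use of Euclidean distance as the visual measure are all assumptions made for illustration. The abstract does not say which distance each descriptor uses or how K is chosen.

```python
from math import sqrt


def two_stage_retrieve(text_scores, visual_features, query_feature, k):
    """Rank by text score, then re-rank only the top-k items visually.

    text_scores: dict of image id -> text relevance score (higher is better).
    visual_features: dict of image id -> feature vector (list of floats).
    query_feature: feature vector of the query image.
    All names and the distance measure are hypothetical.
    """
    # Stage 1: rank the whole collection with the text modality.
    ranked_by_text = sorted(text_scores, key=text_scores.get, reverse=True)
    top_k = ranked_by_text[:k]

    # Stage 2: content-based re-ranking restricted to the top-k items.
    # Euclidean distance is an assumed stand-in for the descriptor's measure.
    def distance(image_id):
        feature = visual_features[image_id]
        return sqrt(sum((a - b) ** 2 for a, b in zip(feature, query_feature)))

    return sorted(top_k, key=distance)


# Toy data: image "c" is visually close to the query but has a low text score.
text_scores = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.7}
visual_features = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.0, 0.9],
    "d": [0.1, 0.9],
}
result = two_stage_retrieve(text_scores, visual_features, [0.0, 1.0], k=3)
print(result)  # ['b', 'd', 'a']
```

In this toy run the text stage filters out image "c" before any visual comparison takes place, so only "a", "b", and "d" are re-ranked by visual distance.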