Topic modelling of clickthrough data in image search

  • Authors:
  • Donn Morrison;Theodora Tsikrika;Vera Hollink;Arjen P. de Vries;Éric Bruno;Stéphane Marchand-Maillet

  • Affiliations:
  • Computer Vision and Multimedia Laboratory, University of Geneva, Geneva, Switzerland;Centrum Wiskunde & Informatica, Amsterdam, The Netherlands;Centrum Wiskunde & Informatica, Amsterdam, The Netherlands;Centrum Wiskunde & Informatica, Amsterdam, The Netherlands;Computer Vision and Multimedia Laboratory, University of Geneva, Geneva, Switzerland;Computer Vision and Multimedia Laboratory, University of Geneva, Geneva, Switzerland

  • Venue:
  • Multimedia Tools and Applications
  • Year:
  • 2013

Abstract

In this paper we explore the benefits of latent variable modelling of clickthrough data in the domain of image retrieval. Clicks in image search logs are regarded as implicit relevance judgements that express both user intent and important relations between selected documents. We posit that clickthrough data contains hidden topics and can be used to infer a lower-dimensional latent space that can subsequently be employed to improve various aspects of the retrieval system. We use a subset of a clickthrough corpus from the image search portal of a news agency to evaluate several popular latent variable models in terms of their ability to model the topics underlying queries. We demonstrate that latent variable modelling reveals underlying structure in clickthrough data. Our results show that computing document similarities in the latent space improves retrieval effectiveness compared to computing similarities in the original query space. These results are compared with baselines using visual and textual features. Performance is substantially better than the visual baseline, which indicates that content-based image retrieval systems that do not exploit query logs could improve recall and precision by taking this historical data into account.
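
The contrast between similarities in the original query space and in a latent space can be illustrated with a minimal sketch. The abstract does not name the latent variable models that were evaluated, so the truncated SVD used here is only an assumed stand-in, and the click matrix, the function names, and the choice of k = 2 are all hypothetical toy values rather than anything taken from the paper.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows as zero vectors
    Xn = X / norms
    return Xn @ Xn.T


def latent_document_vectors(clicks, k):
    """Project documents (rows) onto the top-k singular directions.

    Truncated SVD is an assumed stand-in for a latent variable model;
    it is not claimed to be one of the models used in the paper.
    """
    U, s, _ = np.linalg.svd(clicks, full_matrices=False)
    return U[:, :k] * s[:k]


# Hypothetical toy click matrix: rows are images, columns are queries,
# entries are click counts. Images 0 and 1 were never clicked for the
# same query, but image 2 was clicked for both of their queries.
clicks = np.array([
    [2, 0, 0, 0],
    [0, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
], dtype=float)

query_sim = cosine_similarity_matrix(clicks)
latent_sim = cosine_similarity_matrix(latent_document_vectors(clicks, k=2))

print("query space,  images 0 and 1:", round(float(query_sim[0, 1]), 3))
print("latent space, images 0 and 1:", round(float(latent_sim[0, 1]), 3))
print("latent space, images 0 and 3:", round(float(latent_sim[0, 3]), 3))
```

In this toy setting, images 0 and 1 share no query and so have zero similarity in the query space, while the latent projection places them close together because image 2 links their queries. Images 0 and 3 remain dissimilar in both spaces.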