Diversifying image search with user generated content

  • Authors:
  • Roelof van Zwol, Vanessa Murdock, Lluis Garcia Pueyo, Georgina Ramirez

  • Affiliations:
  • Yahoo! Research, Barcelona, Spain (all authors)

  • Venue:
  • MIR '08: Proceedings of the 1st ACM international conference on Multimedia information retrieval
  • Year:
  • 2008

Abstract

Large-scale image retrieval on the Web relies on the availability of short snippets of text associated with the image. This user-generated content is a primary source of information about the content and context of an image. While traditional information retrieval models focus on finding the most relevant document without consideration for diversity, image search requires results that are both diverse and relevant. This is problematic for images because they are represented very sparsely by text, and, as with all user-generated content, the text for a given image can be extremely noisy. The contribution of this paper is twofold. First, we present a retrieval model that provides diverse results as a property of the model itself, rather than in a post-retrieval step. Relevance models offer a unified framework to afford the greatest diversity without harming precision. Second, we show that it is possible to minimize the trade-off between precision and diversity, and that estimating the query model from the distribution of tags favors the dominant sense of a query. Relevance models operating only on tags offer the highest level of diversity with no significant decrease in precision.
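
The abstract says the query model is estimated from the distribution of tags but gives no estimation details. As a rough sketch only, the following Python code illustrates relevance-model estimation over tag documents in the sense of Lavrenko and Croft's RM1; the function names, the additive smoothing scheme, and the pseudo-count mu are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter

def tag_language_model(tags, vocab, mu=0.5):
    """Smoothed unigram model over one image's tags.
    Additive smoothing with pseudo-count mu is an assumption,
    not necessarily the paper's choice."""
    counts = Counter(tags)
    denom = len(tags) + mu * len(vocab)
    return {w: (counts[w] + mu) / denom for w in vocab}

def relevance_model(query, tagged_images, vocab):
    """Estimate P(w|Q) proportional to
    sum over images D of P(w|D) * prod over q in Q of P(q|D),
    i.e. RM1 computed over tag documents."""
    rm = Counter()
    for tags in tagged_images:
        lm = tag_language_model(tags, vocab)
        # Query likelihood under this image's tag model.
        query_likelihood = 1.0
        for q in query:
            query_likelihood *= lm.get(q, 0.0)
        # Each image contributes mass in proportion to that likelihood.
        for w in vocab:
            rm[w] += query_likelihood * lm[w]
    total = sum(rm.values())
    return {w: p / total for w, p in rm.items()} if total else dict(rm)

# Toy usage: two "fruit" images and one "phone" image tagged "apple".
images = [["apple", "fruit", "red"],
          ["apple", "fruit", "green"],
          ["apple", "iphone", "phone"]]
vocab = sorted({t for tags in images for t in tags})
print(relevance_model(["apple"], images, vocab))
```

On this toy input the "fruit" tags receive more mass than the "phone" tags, which matches the behavior the abstract describes: a query model estimated from the tag distribution favors the dominant sense of the query.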