Finding images of difficult entities in the long tail

  • Authors:
  • Bilyana Taneva; Mouna Kacimi; Gerhard Weikum

  • Affiliations:
  • Max-Planck Institute for Informatics, Saarbrücken, Germany; Free University of Bozen-Bolzano, Bozen-Bolzano, Italy; Max-Planck Institute for Informatics, Saarbrücken, Germany

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Abstract

While images of famous people and places are abundant on the Internet, they are much harder to retrieve for less popular entities such as notable computer scientists or regionally interesting churches. Querying image search engines with the entity names yields large candidate lists, but these often have low precision and unsatisfactory recall. In this paper, we propose a principled model for finding images of rare or ambiguous named entities. We present a set of efficient, lightweight algorithms that identify entity-specific keyphrases from a given textual description of the entity; these keyphrases are then used to score candidate images based on keyphrase matches in the underlying Web pages. Our experiments show the high precision-recall quality of our approach.
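
The general idea of scoring candidate images by keyphrase matches in their surrounding Web pages can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how keyphrases are extracted or weighted, so the n-gram extraction, the stopword list, the length-based weights, and all function names below (`extract_keyphrases`, `score_page`, `rank_candidates`) are assumptions chosen purely for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "at", "is", "for", "on", "to", "was"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_keyphrases(description, max_len=2):
    """Collect word n-grams (up to max_len words) from the entity description.

    Hypothetical weighting: each occurrence of a phrase adds its length in
    words, so longer and repeated phrases count more.
    """
    tokens = tokenize(description)
    phrases = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            # Skip phrases that start or end with a stopword.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            phrases[" ".join(gram)] += n
    return dict(phrases)


def score_page(page_text, keyphrases):
    """Sum the weights of all keyphrases that occur in the page text."""
    normalized = " " + " ".join(tokenize(page_text)) + " "
    return sum(w for phrase, w in keyphrases.items() if f" {phrase} " in normalized)


def rank_candidates(description, candidates):
    """Rank candidate images by the score of the page each one appears on.

    `candidates` maps an image identifier to the text of its Web page.
    """
    keyphrases = extract_keyphrases(description)
    scored = [(image, score_page(text, keyphrases)) for image, text in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    description = (
        "Gerhard Weikum is a database researcher at the "
        "Max Planck Institute for Informatics"
    )
    candidates = {
        "a.jpg": "Photo of Gerhard Weikum, database researcher, Max Planck Institute",
        "b.jpg": "Weikum bakery opening hours and prices",
    }
    for image, score in rank_candidates(description, candidates):
        print(image, score)
```

In this toy setup, a page that matches several multi-word phrases from the description outranks a page that merely shares the entity's surname, which is the kind of disambiguation the abstract motivates for rare or ambiguous names.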