Document expansion for image retrieval

  • Authors: Jinming Min; Johannes Leveling; Dong Zhou; Gareth J. F. Jones
  • Affiliations: Dublin City University, Dublin, Ireland; Dublin City University, Dublin, Ireland; University of Dublin, Dublin, Ireland; Dublin City University, Dublin, Ireland
  • Venue: RIAO '10 Adaptivity, Personalization and Fusion of Heterogeneous Information
  • Year: 2010

Abstract

Successful information retrieval requires effective matching between the user's search request and the contents of relevant documents. Often the request entered by a user does not use the same topic-relevant terms as the authors of these documents. One potential approach to this query-document term mismatch is document expansion: adding topically relevant indexing terms to a document so that it is more likely to be retrieved for relevant queries that do not match its original contents well. We propose and evaluate a new document expansion method using external resources. While previous research has been inconclusive about the impact of document expansion on retrieval effectiveness, our method is shown to work effectively for text-based image retrieval of short image annotation documents. Our approach uses the Okapi query expansion algorithm as a method for document expansion. We further show that improved performance can be achieved by using a "document reduction" approach, which includes only the significant terms of a document in the expansion process. Our experiments on the WikipediaMM task at ImageCLEF 2008 show an increase of 16.5% in mean average precision (MAP) compared to a variation of the Okapi BM25 retrieval model. To compare document expansion with query expansion, we also test query expansion from an external resource, which leads to an improvement of 9.84% in MAP over our baseline. Our conclusion is that document expansion with document reduction, in combination with query expansion, produces the overall best retrieval results for short-length document retrieval. For this image retrieval task, we also conclude that query expansion from external resources does not outperform the document expansion method.
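
The pipeline outlined in the abstract (reduce a short document to its significant terms, use them as a query against an external resource, then add terms selected by an Okapi-style weight) can be illustrated with a minimal sketch. The abstract gives no formulas or parameters, so everything below is an assumption for illustration: the function names, the IDF-based reduction rule, the term-overlap ranking that stands in for BM25, the Robertson offer weight used for term selection, and the cut-offs `keep`, `top_docs`, and `n_terms`. It is not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def reduce_document(doc_tokens, doc_freq, n_docs, keep=3):
    """Assumed 'document reduction': keep the `keep` highest-IDF terms
    that occur at least once in the external collection."""
    def idf(term):
        return math.log((n_docs + 1) / (doc_freq[term] + 0.5))

    known = [t for t in set(doc_tokens) if doc_freq.get(t, 0) > 0]
    known.sort(key=lambda t: (-idf(t), t))  # ties broken alphabetically
    return known[:keep]


def expand_document(doc, external, keep=3, top_docs=2, n_terms=2):
    """Append up to `n_terms` expansion terms from `external` to `doc`."""
    ext_tokens = [tokenize(d) for d in external]
    n_docs = len(ext_tokens)
    doc_freq = Counter(t for toks in ext_tokens for t in set(toks))

    doc_tokens = tokenize(doc)
    query = set(reduce_document(doc_tokens, doc_freq, n_docs, keep))

    # Rank external documents by the number of matched query terms.
    # This is a simple stand-in for a BM25 ranking (an assumption).
    ranked = sorted(range(n_docs),
                    key=lambda i: (-len(query & set(ext_tokens[i])), i))
    feedback = [ext_tokens[i] for i in ranked[:top_docs]
                if query & set(ext_tokens[i])]

    big_r = len(feedback)  # number of pseudo-relevant documents
    r_counts = Counter(t for toks in feedback for t in set(toks))

    def offer_weight(term):
        # Robertson offer weight r * rw, with the usual 0.5 smoothing.
        r, n = r_counts[term], doc_freq[term]
        rw = math.log(((r + 0.5) * (n_docs - n - big_r + r + 0.5))
                      / ((n - r + 0.5) * (big_r - r + 0.5)))
        return r * rw

    candidates = [t for t in r_counts if t not in doc_tokens]
    candidates.sort(key=lambda t: (-offer_weight(t), t))
    return doc_tokens + candidates[:n_terms]


if __name__ == "__main__":
    # Toy external resource and annotation, invented for this example.
    external_resource = [
        "eiffel tower paris france landmark",
        "eiffel tower iron lattice paris",
        "golden gate bridge san francisco",
        "london bridge river thames",
        "paris france capital city",
    ]
    print(expand_document("photo of eiffel tower", external_resource))
```

In this sketch the original annotation terms are always kept and expansion terms are only appended, so a short annotation gains vocabulary without losing its own. A real system would index the expanded documents with a proper retrieval model and tune the cut-offs on held-out queries.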