Diversity in image retrieval: DCU at ImageCLEFPhoto 2008

  • Authors:
  • Neil O'Hare;Peter Wilkins;Cathal Gurrin;Eamonn Newman;Gareth J. F. Jones;Alan F. Smeaton

  • Affiliations:
  • Centre For Digital Video Processing, Dublin City University, Ireland;Centre For Digital Video Processing, Dublin City University, Ireland;Centre For Digital Video Processing, Dublin City University, Ireland and Centre for Sensor Web Technologies;Centre For Digital Video Processing, Dublin City University, Ireland;Centre For Digital Video Processing, Dublin City University, Ireland;Centre For Digital Video Processing, Dublin City University, Ireland and Centre for Sensor Web Technologies

  • Venue:
  • CLEF'08: Proceedings of the 9th Cross-Language Evaluation Forum Conference on Evaluating Systems for Multilingual and Multimodal Information Access
  • Year:
  • 2008

Abstract

DCU participated in the ImageCLEF 2008 photo retrieval task, which aimed to evaluate diversity in image retrieval, submitting runs for both the English and Random language annotation conditions. Our approaches used text-based and image-based retrieval to produce baseline runs; the highest-ranked images from these baseline runs were then clustered using K-Means clustering of their text annotations, and a representative image from each cluster was ranked for the final submission. For the random language annotations, we compared results from translated runs with untranslated runs. Our results show that combining image and text outperforms either text alone or image alone, both for general retrieval performance and for diversity. Our baseline image and text runs give our best overall balance between retrieval and diversity; indeed, our baseline text and image run was the second-best automatic run in the ImageCLEF 2008 Photographic Retrieval task. We found that clustering consistently gives a large improvement in diversity over the unclustered baseline results, while degrading retrieval performance. Pseudo-relevance feedback consistently improved retrieval, but always at the cost of diversity. We also found that the diversity of untranslated random runs was quite close to that of translated random runs, indicating that, for this dataset at least, if diversity is the main concern it may not be necessary to translate the image annotations.
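The abstract does not give implementation details of the clustering step, so the following is only a minimal sketch of the general idea: cluster the text annotations of the top-ranked images with K-Means and promote one representative image per cluster, ordered by the original retrieval score. The function name, the use of TF-IDF features, and the default of 20 clusters are illustrative assumptions, not details taken from the paper.

```python
# Sketch (not the authors' implementation) of diversity re-ranking via
# K-Means clustering of the text annotations of top-ranked images.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans


def diversify(ranked_results, k=20):
    """ranked_results: list of (image_id, annotation_text, score),
    sorted by descending retrieval score. Returns a re-ranked list with
    one representative image per cluster placed first."""
    annotations = [text for _, text, _ in ranked_results]

    # Represent each annotation as a TF-IDF vector and cluster the vectors.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(annotations)
    labels = KMeans(n_clusters=min(k, len(ranked_results)), n_init=10,
                    random_state=0).fit_predict(vectors)

    # The highest-scoring image in each cluster becomes its representative.
    representatives, rest, seen = [], [], set()
    for result, label in zip(ranked_results, labels):
        if label not in seen:
            seen.add(label)
            representatives.append(result)
        else:
            rest.append(result)

    # Representatives (in original score order) head the final ranking,
    # followed by the remaining images in their original order.
    return representatives + rest
```

In this sketch the trade-off reported in the abstract is visible in the structure of the output: pushing one image per cluster to the top improves diversity among the early ranks, but displacing other high-scoring images downward can degrade standard retrieval measures.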