Fast content-based retrieval from online photo sharing sites

  • Authors:
  • Gerald Schaefer;David Edmundson

  • Affiliations:
  • Department of Computer Science, Loughborough University, Loughborough, U.K.;Department of Computer Science, Loughborough University, Loughborough, U.K.

  • Venue:
  • AMT'12 Proceedings of the 8th international conference on Active Media Technology
  • Year:
  • 2012

Abstract

Billions of images have been uploaded to photo sharing sites since their inception, comprising a staggering wealth of visual information. However, effective tools for querying these collections are rare, and those that exist are keyword-based. Since users rarely annotate their images, this approach is of limited use. Content-based image retrieval (CBIR) extracts features directly from images and bases searches on these features. Conventional CBIR approaches, however, require a dedicated system that performs feature extraction during photo upload and a database system to store the features, and are hence not available to the average user. In this paper, we present a very fast content-based retrieval method that performs feature extraction on the fly during the retrieval process and can thus be employed client-side on images downloaded from photo sharing sites such as Flickr. Our approach is based on the fact that images uploaded to Flickr are stored in a JPEG format optimised to minimise disk space and bandwidth usage. In particular, we exploit the optimised Huffman compression tables, which are stored in the JPEG headers, as image descriptors. In contrast to other approaches, we therefore need to read only a fraction of each image file, and the similarity calculation is of low complexity. As a result, our approach is extremely fast, as we demonstrate by measuring the bandwidth used to retrieve images from the Flickr photo sharing site. We also show that retrieval performance is nevertheless comparable to that of CBIR using colour histograms, which are at the core of many CBIR systems.
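To illustrate the idea of using header-resident Huffman tables as descriptors, the following is a minimal Python sketch, not the authors' implementation. It walks the JPEG marker segments up to the first start-of-scan marker and collects every Huffman table defined in a DHT segment. The function names `read_huffman_tables` and `table_distance` are hypothetical, and the distance used here (an L1 difference over the 16 code-length counts of each table) is an assumption chosen for simplicity; the abstract does not specify the similarity measure used in the paper.

```python
import struct

SOI = b"\xff\xd8"   # start of image
DHT = 0xC4          # define Huffman table(s)
SOS = 0xDA          # start of scan: compressed image data follows


def read_huffman_tables(data):
    """Return {(table_class, table_id): (counts, symbols)} from JPEG bytes.

    Only the header is inspected: parsing stops at the first SOS marker,
    so tables defined between later scans (progressive JPEG) are ignored.
    """
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream (missing SOI marker)")
    tables = {}
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:      # fill byte before the real marker
            pos += 1
            continue
        if marker == SOS:
            break
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == DHT:
            i = 0
            # One DHT segment may hold several tables back to back.
            while i + 17 <= len(payload):
                table_class = payload[i] >> 4     # 0 = DC, 1 = AC
                table_id = payload[i] & 0x0F
                counts = tuple(payload[i + 1:i + 17])  # codes per bit length 1..16
                n = sum(counts)
                symbols = bytes(payload[i + 17:i + 17 + n])
                tables[(table_class, table_id)] = (counts, symbols)
                i += 17 + n
        pos += 2 + length
    return tables


def table_distance(a, b):
    """Assumed similarity measure: L1 distance over code-length counts.

    A table missing from one image is treated as all zeros.
    """
    zero = (0,) * 16
    total = 0
    for key in set(a) | set(b):
        counts_a = a[key][0] if key in a else zero
        counts_b = b[key][0] if key in b else zero
        total += sum(abs(x - y) for x, y in zip(counts_a, counts_b))
    return total
```

Because parsing stops at the first SOS marker, such a routine only needs the leading bytes of each file, which is consistent with the abstract's point that only a fraction of the image file has to be read. Ranking a collection would then amount to sorting candidates by `table_distance` to the query image's tables.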