Object Retrieval Using Visual Query Context

  • Authors: Linjun Yang; Bo Geng; Yang Cai; Alan Hanjalic; Xian-Sheng Hua

  • Affiliations: Microsoft Research Asia, Beijing, China; Peking University, Beijing, China; Zhejiang University, Zhejiang, China; Delft University of Technology, Delft, Netherlands; Microsoft Research Asia, Beijing, China

  • Venue: IEEE Transactions on Multimedia
  • Year: 2011

Abstract

Object retrieval aims to retrieve images containing objects similar to the query object captured in the region of interest (ROI) of the query image. Boosted by the invention and wide popularity of SIFT image features and the bag-of-visual-words image representation, object retrieval has progressed significantly in recent years and has already been deployed in real-life applications and products. While existing object retrieval methods perform well in many cases, they may fail to return satisfactory results if the ROI specified by the user is inaccurate, or if the object captured there is too small to be represented by discriminative features and therefore cannot be matched with similar objects in the image collection. To improve object retrieval performance in these difficult cases as well, we propose an object retrieval method that exploits information about the visual context of the query object and employs it to compensate for possible uncertainty in the feature-based query object representation. Contextual information is drawn from the visual elements surrounding the query object in the query image. We treat the ROI as an uncertain observation of the latent search intent and the saliency map detected for the query image as a prior. A language modeling approach is then employed to devise a contextual object retrieval (COR) model, in which the relevance score is determined from the search intent scores inferred from the uncertain ROI and the saliency prior. The usefulness of contextual information for object retrieval and the effectiveness of the proposed COR model are demonstrated and evaluated on three representative image datasets.
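The idea of combining an uncertain ROI with a saliency prior can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the Gaussian falloff outside the ROI, the Dirichlet-smoothed document model, and all function and parameter names (`intent_scores`, `query_model`, `relevance`, `sigma`, `mu`) are hypothetical and should not be read as the authors' COR model.

```python
import math
from collections import defaultdict


def intent_scores(features, roi, saliency, sigma=20.0):
    """Score each query feature by how likely it reflects the search intent.

    features: list of (visual_word_id, x, y)
    roi:      (x0, y0, x1, y1) rectangle drawn by the user
    saliency: callable (x, y) -> value in [0, 1], used as a prior
    sigma:    assumed softness of the ROI boundary, in pixels
    """
    scores = []
    for _, x, y in features:
        # Distance from the feature to the ROI rectangle (0 if inside).
        dx = max(roi[0] - x, 0.0, x - roi[2])
        dy = max(roi[1] - y, 0.0, y - roi[3])
        # Assumption: ROI likelihood decays as a Gaussian outside the box.
        likelihood = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2))
        scores.append(likelihood * saliency(x, y))
    return scores


def query_model(features, scores):
    """Build an intent-weighted distribution over visual words."""
    weights = defaultdict(float)
    for (word, _, _), score in zip(features, scores):
        weights[word] += score
    total = sum(weights.values())
    if total <= 0.0:
        return {}
    return {word: value / total for word, value in weights.items()}


def relevance(query_lm, doc_counts, collection_lm, mu=100.0):
    """Rank score: query-weighted log-likelihood under a smoothed document model.

    Assumption: Dirichlet smoothing with parameter mu against a collection model.
    """
    doc_len = sum(doc_counts.values())
    score = 0.0
    for word, p_query in query_lm.items():
        p_coll = collection_lm.get(word, 1e-9)
        p_doc = (doc_counts.get(word, 0) + mu * p_coll) / (doc_len + mu)
        score += p_query * math.log(p_doc)
    return score


# Toy usage: one feature inside the ROI, one far outside it.
features = [(1, 50.0, 50.0), (2, 200.0, 200.0)]
roi = (40.0, 40.0, 60.0, 60.0)
scores = intent_scores(features, roi, saliency=lambda x, y: 1.0)
q_lm = query_model(features, scores)

collection_lm = {1: 0.25, 2: 0.25, 3: 0.5}
doc_a = {1: 5, 3: 5}  # contains the visual word found inside the ROI
doc_b = {2: 5, 3: 5}  # contains only the distant context word
rank_a = relevance(q_lm, doc_a, collection_lm)
rank_b = relevance(q_lm, doc_b, collection_lm)
```

In this toy setting, features outside the ROI are down-weighted rather than discarded, so context near the object can still contribute when the ROI is drawn imprecisely or the object yields few features.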