Refining image annotation using contextual relations between words

  • Authors: Yong Wang; Shaogang Gong
  • Affiliations: University of London, London, UK (both authors)
  • Venue: Proceedings of the 6th ACM International Conference on Image and Video Retrieval
  • Year: 2007

Abstract

In this paper, we present a probabilistic approach to refining image annotations by incorporating semantic relations between annotation words. Our approach first predicts a candidate set of annotation words with confidence scores. This is achieved with the relevance vector machine (RVM), a kernel-based probabilistic classifier capable of nonlinear classification. Given the candidate annotations, we model semantic relationships between words using a conditional random field (CRF) in which each vertex represents the final decision (true/false) on a candidate annotation word. The refined annotation is obtained by inferring the most likely states of these vertices. In the CRF model, the confidence scores given by the RVM classifiers serve as local evidence. In addition, we use the normalized Google distance (NGD) between two words as their contextual potential. NGD is a distance function between two words computed from the hit counts returned when the words are queried on the Google search engine; it has a simple mathematical formulation grounded in Kolmogorov complexity theory. We also propose a learning algorithm to tune the weight parameters of the CRF model, which control the balance between the local evidence for a single word and the contextual relations between words. Our experiments on Corel images demonstrate the effectiveness of our approach.
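The abstract does not reproduce the NGD formula itself. For reference, the standard definition of the normalized Google distance (due to Cilibrasi and Vitányi), on which the paper's contextual potential is based, is:

```latex
\mathrm{NGD}(x, y) = \frac{\max\{\log f(x), \log f(y)\} - \log f(x, y)}
                          {\log N - \min\{\log f(x), \log f(y)\}}
```

Here f(x) and f(y) are the numbers of pages returned by Google for each word queried alone, f(x, y) is the number of pages containing both words, and N is the total number of pages indexed. Words that frequently co-occur, and are thus semantically related, yield small NGD values.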
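The abstract does not specify the exact potentials or the inference procedure, so the following is a minimal sketch of the refinement step under stated assumptions: log RVM confidence scores serve as the unary (local) evidence, exp(-NGD) acts as a pairwise reward for jointly keeping semantically close words, two scalar weights alpha and beta stand in for the learned CRF weight parameters, and exhaustive enumeration of labelings replaces general CRF inference (feasible because each image has only a handful of candidate words). All names and numbers are illustrative, not taken from the paper.

```python
import itertools
import math


def ngd(f_x, f_y, f_xy, n_pages):
    """Normalized Google distance from raw search-engine hit counts.

    f_x, f_y : page counts for each word queried alone (must be > 0)
    f_xy     : page count for the two words queried together (must be > 0)
    n_pages  : total number of pages indexed by the engine
    """
    lx, ly, lxy = math.log(f_x), math.log(f_y), math.log(f_xy)
    return (max(lx, ly) - lxy) / (math.log(n_pages) - min(lx, ly))


def refine_annotations(words, scores, pair_dist, alpha=1.0, beta=1.0):
    """Return the subset of candidate words in the highest-scoring labeling.

    words     : candidate annotation words for one image
    scores    : RVM confidence score in (0, 1) for each word
    pair_dist : NGD values keyed by index pairs (i, j) with i < j
    alpha     : weight on the local (unary) evidence
    beta      : weight on the contextual (pairwise) term

    Exhaustively scores all 2^n true/false labelings, which is feasible
    because each image has only a handful of candidate words.
    """
    n = len(words)
    best_labels, best_value = None, float("-inf")
    for labels in itertools.product((0, 1), repeat=n):
        # Unary term: log-likelihood of each decision under the
        # word's RVM confidence score.
        local = sum(
            math.log(scores[i]) if labels[i] else math.log(1.0 - scores[i])
            for i in range(n)
        )
        # Pairwise term: keeping two semantically close words
        # (small NGD) earns a larger reward.
        context = sum(
            math.exp(-pair_dist[(i, j)])
            for i in range(n) for j in range(i + 1, n)
            if labels[i] and labels[j]
        )
        value = alpha * local + beta * context
        if value > best_value:
            best_value, best_labels = value, labels
    return [w for w, keep in zip(words, best_labels) if keep]


if __name__ == "__main__":
    # Toy example with made-up scores and distances: 'ocean' has a weak
    # score and is far (in NGD terms) from the other words, so it is dropped.
    words = ["tiger", "grass", "ocean"]
    scores = [0.90, 0.70, 0.20]
    dist = {(0, 1): 0.3, (0, 2): 0.9, (1, 2): 0.8}
    print(refine_annotations(words, scores, dist))  # ['tiger', 'grass']
```

Exhaustive enumeration keeps the sketch simple; for larger candidate sets one would switch to a standard CRF inference method such as loopy belief propagation.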