Combining global, regional and contextual features for automatic image annotation

  • Authors:
  • Yong Wang;Tao Mei;Shaogang Gong;Xian-Sheng Hua

  • Affiliations:
  • Department of Computer Science, Queen Mary, University of London, London E1 4NS, UK;Microsoft Research Asia, Beijing 100190, PR China;Department of Computer Science, Queen Mary, University of London, London E1 4NS, UK;Microsoft Research Asia, Beijing 100190, PR China

  • Venue:
  • Pattern Recognition
  • Year:
  • 2009

Abstract

This paper presents a novel approach to automatic image annotation which combines global, regional, and contextual features through an extended cross-media relevance model. Unlike typical image annotation methods, which use either global or regional features exclusively and neglect the textual context among the annotated words, the proposed approach incorporates all three kinds of information, each helpful for describing image semantics, and annotates images by estimating their joint probability. Specifically, we describe the global features as a distribution vector of visual topics and model the textual context as a multinomial distribution. The global features capture the distribution of visual topics over an image, while the textual context relaxes the assumption of mutual independence among annotated words that is commonly adopted in most existing methods. Both the global features and the textual context are learned from the training data by a probabilistic latent semantic analysis (pLSA) approach. Experiments on the 5k Corel image set show that combining these three kinds of information is beneficial for image annotation.
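The abstract describes the model only at a high level. The following Python sketch illustrates the general idea of a cross-media relevance-model-style word scorer extended with a global topic-similarity term: the toy corpus, the Bhattacharyya similarity used to compare pLSA topic distributions, the Jelinek-Mercer smoothing parameter `lam`, and all function names are assumptions made for illustration, not the paper's exact estimator; the multinomial word-context term is omitted for brevity.

```python
# Minimal sketch (assumed formulation, not the authors' exact model):
# score a candidate word for a test image by summing, over training images,
# a global topic-similarity term times smoothed word and blob probabilities.
import numpy as np

def smoothed_prob(count_in_image, image_len, count_in_corpus, corpus_len, lam):
    """Jelinek-Mercer smoothing of a unigram probability."""
    return (1 - lam) * count_in_image / max(image_len, 1) + lam * count_in_corpus / corpus_len

def score_word(word, query_blobs, query_topics, train, corpus_word_counts,
               corpus_blob_counts, total_words, total_blobs, lam=0.5):
    """Joint score of (word, test image) under a CMRM-style sum over training images."""
    score = 0.0
    for img in train:
        # Global term: Bhattacharyya coefficient between pLSA topic distributions
        # (a stand-in similarity; the paper may use a different form).
        global_term = float(np.sum(np.sqrt(query_topics * img["topics"])))
        # Textual term: smoothed P(word | training image).
        p_w = smoothed_prob(img["words"].count(word), len(img["words"]),
                            corpus_word_counts[word], total_words, lam)
        # Regional term: product of smoothed P(blob | training image).
        p_blobs = 1.0
        for b in query_blobs:
            p_blobs *= smoothed_prob(img["blobs"].count(b), len(img["blobs"]),
                                     corpus_blob_counts.get(b, 0), total_blobs, lam)
        score += global_term * p_w * p_blobs / len(train)  # uniform prior P(J)
    return score

# Toy corpus: two training images with word labels, discrete region (blob) ids,
# and a hypothetical 3-topic pLSA distribution each.
train = [
    {"words": ["sky", "sea"], "blobs": [0, 1, 1], "topics": np.array([0.7, 0.2, 0.1])},
    {"words": ["tiger", "grass"], "blobs": [2, 3], "topics": np.array([0.1, 0.3, 0.6])},
]
vocab = sorted({w for img in train for w in img["words"]})
blob_ids = sorted({b for img in train for b in img["blobs"]})
corpus_word_counts = {w: sum(img["words"].count(w) for img in train) for w in vocab}
corpus_blob_counts = {b: sum(img["blobs"].count(b) for img in train) for b in blob_ids}
total_words = sum(corpus_word_counts.values())
total_blobs = sum(corpus_blob_counts.values())

query_blobs = [0, 1]                      # discrete regions of the test image
query_topics = np.array([0.6, 0.3, 0.1])  # its global visual-topic distribution

scores = {w: score_word(w, query_blobs, query_topics, train, corpus_word_counts,
                        corpus_blob_counts, total_words, total_blobs) for w in vocab}
print(sorted(scores.items(), key=lambda kv: -kv[1]))  # 'sky'/'sea' rank highest here
```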