Canonical contextual distance for large-scale image annotation and retrieval

  • Authors:
  • Hideki Nakayama; Tatsuya Harada; Yasuo Kuniyoshi

  • Affiliations:
  • The University of Tokyo, Tokyo, Japan; The University of Tokyo, Tokyo, Japan; The University of Tokyo, Tokyo, Japan

  • Venue:
  • LS-MMRM '09: Proceedings of the First ACM Workshop on Large-Scale Multimedia Retrieval and Mining
  • Year:
  • 2009

Abstract

To realize generic image recognition, a system needs to learn an enormous number of targets in the world and their appearances. Visual knowledge acquisition from massive collections of web images has therefore been studied intensively in recent years, and search-based methods are now flourishing in this research field. In general, however, the search in such methods is conducted with similarity measures based on simple image features and suffers from the semantic gap. This is a serious problem and can become the bottleneck of the entire system. In this paper, we propose a method for image annotation and retrieval based on a new similarity measure, the Canonical Contextual Distance. The method exploits the contexts of images estimated from their multiple labels and learns an essential, discriminative latent space. Owing to its probabilistic structure, our similarity measure reflects both the appearance and the semantics of samples. Because the learning method is highly scalable, it remains effective on large, web-scale datasets, and our similarity measure should therefore also be useful to many other search-based methods. In the experiments, we show that our method outperforms previous work on the standard Corel benchmark. We then validate the method by applying it to 3.5 million web images.
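
As a rough illustration of the idea only (not the authors' implementation, which builds the Canonical Contextual Distance on a probabilistic canonical correlation model), the sketch below learns a shared latent space between visual features and multi-label context vectors using plain CCA and performs annotation by nearest-neighbor search in that space. All feature dimensions, label counts, and data here are hypothetical placeholders.

  import numpy as np
  from sklearn.cross_decomposition import CCA
  from sklearn.neighbors import NearestNeighbors

  rng = np.random.default_rng(0)

  # Hypothetical training data: visual features X and binary multi-label contexts Y.
  n_train, d_visual, n_labels, d_latent = 500, 128, 20, 16
  X = rng.normal(size=(n_train, d_visual))                    # image appearance features
  Y = (rng.random((n_train, n_labels)) < 0.1).astype(float)   # label-context vectors

  # Learn a shared latent space that correlates appearance with label context.
  cca = CCA(n_components=d_latent)
  cca.fit(X, Y)

  # Project training images into the latent space once (a cheap linear projection,
  # which is what makes this style of method scalable to web-sized collections).
  Z_train = cca.transform(X)

  # Annotate a new image: project its visual features and look up its nearest
  # neighbors in the latent space; their labels provide annotation scores.
  nn = NearestNeighbors(n_neighbors=5).fit(Z_train)
  x_query = rng.normal(size=(1, d_visual))
  z_query = cca.transform(x_query)
  _, idx = nn.kneighbors(z_query)
  annotation_scores = Y[idx[0]].sum(axis=0)
  print(annotation_scores)

The key design point this sketch tries to convey is that distances are computed in a latent space tied to the label context rather than on raw image features, so nearest neighbors are closer in meaning, not only in appearance; the paper's probabilistic formulation refines how that latent-space distance is defined.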