Learning semantic distance from community-tagged media collection

  • Authors:
  • Guo-Jun Qi;Xian-Sheng Hua;Hong-Jiang Zhang

  • Affiliations:
  • University of Illinois at Urbana-Champaign, Urbana, IL, USA;Microsoft Research Asia, Beijing, China;Microsoft Advanced Technology Center, Beijing, China

  • Venue:
  • MM '09 Proceedings of the 17th ACM international conference on Multimedia
  • Year:
  • 2009


Abstract

This paper proposes a novel semantic-aware distance metric for images, learned by mining multimedia data on the Internet, in particular web images and their associated tags. As is well known, a proper distance metric between images is a key ingredient in many practical web image retrieval engines, as well as in many image understanding techniques. In this paper, we attempt to mine such a distance metric from web images by integrating their visual content with their associated user tags. Unlike many existing distance metric learning algorithms, which rely only on similarity or dissimilarity information between image pixels or features at the signal level, the proposed scheme also takes the associated user-input tags into account. The visual content of images is further leveraged to respect the intuitive assumption that visually similar images ought to have a smaller distance. A semi-definite program is formulated to encode these two criteria for learning the distance metric, and we show that this optimization problem can be solved efficiently with a closed-form solution. We evaluate the proposed algorithm on two datasets: the benchmark Corel dataset and a real-world dataset crawled from the image sharing website Flickr. In comparison with other existing distance learning algorithms, the proposed algorithm obtains competitive results in experiments.
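The core idea of the abstract, learning a Mahalanobis-style image distance in which tag-similar images end up closer, can be illustrated with a minimal sketch. This is not the paper's SDP formulation; it is a simpler RCA-style closed-form stand-in, and the Jaccard tag-overlap measure, the similarity threshold, and all function names are illustrative assumptions.

```python
import numpy as np

def tag_similarity(tags_a, tags_b):
    """Jaccard overlap between two images' tag sets (assumed proxy
    for semantic similarity; not the paper's exact criterion)."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def learn_metric(X, tags, sim_threshold=0.5, reg=1e-3):
    """Closed-form metric sketch: M is the regularized inverse of the
    covariance of feature differences between tag-similar image pairs,
    so directions in which tag-similar images vary are down-weighted
    and such images become closer under the learned distance."""
    n, d = X.shape
    C = np.zeros((d, d))
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if tag_similarity(tags[i], tags[j]) >= sim_threshold:
                diff = (X[i] - X[j])[:, None]  # column vector of differences
                C += diff @ diff.T
                count += 1
    # Regularization keeps C positive definite, hence M is a valid metric.
    C = C / max(count, 1) + reg * np.eye(d)
    return np.linalg.inv(C)

def metric_distance(M, x, y):
    """Mahalanobis distance sqrt((x - y)^T M (x - y)) under metric M."""
    diff = x - y
    return float(np.sqrt(diff @ M @ diff))
```

In the actual paper the metric is instead obtained by solving a semi-definite program that jointly encodes the tag and visual-similarity criteria, with an efficient closed-form solution; the sketch above only conveys the shape of the learned object (a positive-definite matrix parameterizing the distance).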