Exploiting multi-modal interactions: a unified framework

  • Authors:
  • Ming Li; Xiao-Bing Xue; Zhi-Hua Zhou

  • Affiliations:
  • National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China (all authors)

  • Venue:
  • IJCAI'09: Proceedings of the 21st International Joint Conference on Artificial Intelligence
  • Year:
  • 2009

Abstract

Given an imagebase of tagged images, four types of tasks can be executed, i.e., content-based image retrieval, image annotation, text-based image retrieval, and query expansion. For any of these tasks, the similarity between objects of the concerned type is essential. In this paper, we propose a framework that tackles these four tasks from a unified view. The essence of the framework is to estimate similarities by exploiting the interactions between objects of different modalities. Experiments show that the proposed method can improve similarity estimation, and that, based on the improved similarity estimation, some simple methods can achieve better performance than some state-of-the-art techniques.
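To make the idea of cross-modal similarity estimation concrete, here is a minimal, hypothetical sketch: it is not the authors' actual algorithm, but it illustrates how interactions between the two modalities (images and tags) can inform similarity in each. The co-occurrence matrix `A` and the helper names (`image_similarity`, `tag_similarity`) are assumptions introduced only for illustration.

```python
import numpy as np


def normalize_rows(M):
    """L2-normalize each row, guarding against all-zero rows."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return M / norms


def image_similarity(A):
    """Image-image similarity estimated from shared tags (rows of A are images)."""
    R = normalize_rows(A.astype(float))
    return R @ R.T


def tag_similarity(A):
    """Tag-tag similarity estimated from shared images (columns of A are tags)."""
    C = normalize_rows(A.T.astype(float))
    return C @ C.T


if __name__ == "__main__":
    # Toy imagebase: 4 images x 5 tags; A[i, j] = 1 if image i carries tag j.
    A = np.array([
        [1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 0, 0, 1, 1],
    ])
    print("image-image similarity:\n", image_similarity(A).round(2))
    print("tag-tag similarity:\n", tag_similarity(A).round(2))
```

In this toy setup, images become similar to the extent that they share tags, and tags become similar to the extent that they annotate the same images; a similarity of either kind can then drive retrieval, annotation, or query expansion in the other modality.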