Boosting multimodal semantic understanding by local similarity adaptation and global correlation propagation

  • Authors:
  • Hong Zhang; Xiaoli Liu

  • Affiliations:
  • College of Computer Science & Technology, Wuhan University of Science & Technology, Wuhan;College of Computer Science & Technology, Wuhan University of Science & Technology, Wuhan

  • Venue:
  • PCM'10 Proceedings of the 11th Pacific Rim conference on Advances in multimedia information processing: Part I
  • Year:
  • 2010

Abstract

An important trend in multimedia semantic understanding is the use and support of multimodal data that are heterogeneous in their low-level features, such as images and audio. The main challenge is how to measure the different kinds of correlations among multimodal data. In this paper, we propose a novel approach to boost multimodal semantic understanding from both local and global perspectives. First, the cross-media correlation between images and audio clips is estimated with Kernel Canonical Correlation Analysis (KCCA). Second, a multimodal graph is constructed to enable global correlation propagation with adapted intra-media similarity. Finally, a cross-media retrieval algorithm is discussed as an application of our approach. A prototype system is developed to demonstrate its feasibility and capability. Experimental results are encouraging and indicate that our approach is effective.
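
The abstract does not give the propagation rule, so the following is only an illustrative sketch of how correlation propagation over a multimodal graph might look, not the authors' algorithm. It swaps in a standard damped propagation on a symmetrically normalized graph (the manifold-ranking style update f ← αSf + (1 − α)y). The function name `propagate_scores`, the parameter `alpha`, and all edge weights are assumptions made up for the example; in particular, the `cross` matrix is a hand-written stand-in for cross-media correlations that a method such as KCCA would estimate.

```python
import numpy as np


def propagate_scores(W, y, alpha=0.8, n_iter=200):
    """Spread query relevance over a graph (hypothetical helper).

    W: symmetric non-negative affinity matrix over all media objects.
    y: initial relevance vector (1 for the query node, 0 elsewhere).
    """
    W = np.asarray(W, dtype=float)
    y = np.asarray(y, dtype=float)
    degree = W.sum(axis=1)
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}; isolated nodes get 0.
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    f = y.copy()
    for _ in range(n_iter):
        # Each node absorbs its neighbours' scores, damped toward the query.
        f = alpha * (S @ f) + (1.0 - alpha) * y
    return f


# Toy graph with made-up weights: nodes 0-2 are images, nodes 3-5 are audio clips.
intra_image = np.array([[0.0, 0.9, 0.1],
                        [0.9, 0.0, 0.1],
                        [0.1, 0.1, 0.0]])
intra_audio = np.array([[0.0, 0.8, 0.1],
                        [0.8, 0.0, 0.1],
                        [0.1, 0.1, 0.0]])
# Assumed cross-media correlations (rows: images, columns: audio clips).
cross = np.array([[0.7, 0.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.6]])

# Multimodal graph: intra-media blocks on the diagonal, cross-media off it.
W = np.block([[intra_image, cross],
              [cross.T, intra_audio]])

query = np.zeros(6)
query[0] = 1.0  # use image 0 as the query
scores = propagate_scores(W, query)

# Cross-media retrieval: rank audio nodes by their propagated score.
audio_ranking = 3 + np.argsort(-scores[3:])
print(audio_ranking)
```

In this toy setup the audio clip directly correlated with the query image (node 3) ranks first, and clips with no direct cross-media link still receive a score through their intra-media neighbours, which is the intuition behind combining local similarity with global propagation.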