Cross-modal correlation learning for clustering on image-audio dataset

  • Authors:
  • Hong Zhang; Yueting Zhuang; Fei Wu

  • Affiliations:
  • Zhejiang University, Hangzhou, China & Wuhan University of Science & Technology, Wuhan, China; Zhejiang University, Hangzhou, China; Zhejiang University, Hangzhou, China

  • Venue:
  • Proceedings of the 15th International Conference on Multimedia
  • Year:
  • 2007

Abstract

It is interesting and challenging to explore correlations between different datasets and to utilize such correlations for clustering on those datasets. Cross-modal correlation between images and audio clips can help identify images (or audio clips) of certain semantics. However, the heterogeneity of visual and auditory features makes it difficult to learn cross-modal correlation between them. In this paper, we first analyze the canonical correlation between the feature matrices of images and audio clips during subspace mapping; second, we design a correlation-based similarity reinforcement for images and audio clips; third, we perform image clustering and audio clustering with affinity propagation. Experimental results on an image-audio dataset are encouraging and show that our approach is effective. We also present an interesting application: querying images by audio examples.
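
The abstract outlines a three-step pipeline: canonical correlation analysis (CCA) for subspace mapping, correlation-based similarity reinforcement, and affinity propagation clustering. The Python sketch below is only a minimal illustration of that pipeline under assumptions of my own; the feature matrices are synthetic, the reinforcement step is a simplified stand-in rather than the paper's actual formulation, and all parameter choices (n_components, alpha) are invented.

import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.cluster import AffinityPropagation
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical paired feature matrices: one row per image-audio pair.
rng = np.random.default_rng(0)
X_img = rng.standard_normal((200, 64))   # image features (placeholder)
X_aud = rng.standard_normal((200, 32))   # audio features (placeholder)

# Step 1: map both modalities into a shared subspace with CCA
# (a stand-in for the paper's subspace mapping).
cca = CCA(n_components=10)
Z_img, Z_aud = cca.fit_transform(X_img, X_aud)

# Step 2: correlation-based similarity reinforcement (simplified guess):
# blend each modality's own similarity with cross-modal similarity.
S_img = cosine_similarity(Z_img)
S_aud = cosine_similarity(Z_aud)
S_cross = cosine_similarity(Z_img, Z_aud)
n = S_cross.shape[0]
alpha = 0.5
S_img_r = alpha * S_img + (1 - alpha) * (S_cross @ S_cross.T) / n
S_aud_r = alpha * S_aud + (1 - alpha) * (S_cross.T @ S_cross) / n

# Step 3: cluster each modality with affinity propagation on the
# precomputed (reinforced) similarity matrices.
img_labels = AffinityPropagation(affinity="precomputed", random_state=0).fit_predict(S_img_r)
aud_labels = AffinityPropagation(affinity="precomputed", random_state=0).fit_predict(S_aud_r)

print(len(set(img_labels)), "image clusters;", len(set(aud_labels)), "audio clusters")

With real descriptors in place of the random matrices, the reinforced image similarities could also be ranked against an audio query's subspace representation, which is roughly the flavor of the query-by-audio-example application mentioned in the abstract.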