Fusing inherent and external knowledge with nonlinear learning for cross-media retrieval

  • Authors:
  • Hong Zhang, Yun Liu, Zhigang Ma

  • Affiliations:
  • College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430081, China, and State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, China
  • School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore
  • Department of Information Engineering and Computer Science, University of Trento, Trento 38123, Italy

  • Venue:
  • Neurocomputing
  • Year:
  • 2013

Quantified Score

Hi-index 0.01

Abstract

Cross-media retrieval focuses on searching multimedia data of different modalities with content-based methods. However, most content-based methods are designed for retrieval within a single modality, such as image retrieval or audio retrieval. Although a few works have addressed cross-media retrieval, their performance is not yet satisfactory, and the potential of cross-media retrieval for boosting retrieval performance remains largely unexplored. Hence, in this paper we propose a novel cross-media retrieval approach for general multimedia data, such as images and audio. First, image and audio samples are mapped into an isomorphic feature subspace with a kernel-based method. Second, multimedia semantics is learned from inherent feature correlations by local linear regression, and a graph model is constructed to exploit external knowledge from relevance feedback. We then build a unified objective function that integrates the inherent and external learning results; solving this objective function yields a multimodal semantic space in which cross-media retrieval between images and audio is enabled. Extensive experiments validate the proposed method with encouraging results.
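The pipeline in the abstract (kernel mapping into an isomorphic subspace, semantic regression, a graph built from relevance feedback, and a unified objective) can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's actual formulation: the RBF kernel, the one-hot semantic targets, the same-label graph standing in for relevance feedback, and the closed-form objective ||KW - Y||² + α·tr(WᵀKLKW) + β·||W||² are all assumptions made for the sketch.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=0.1):
    # Pairwise RBF kernel: exp(-gamma * ||x - z||^2)
    d = np.sum(X**2, 1)[:, None] + np.sum(Z**2, 1)[None, :] - 2 * X @ Z.T
    return np.exp(-gamma * d)

rng = np.random.default_rng(0)
n, d_img, d_aud, c = 40, 10, 6, 3          # toy sizes (assumed)
X_img = rng.normal(size=(n, d_img))        # image features
X_aud = rng.normal(size=(n, d_aud))        # audio features
labels = rng.integers(0, c, size=n)
Y = np.eye(c)[labels]                      # one-hot semantic targets (assumed)

# Step 1: kernel features map each modality into an isomorphic n-dim space.
K_img = rbf_kernel(X_img, X_img)
K_aud = rbf_kernel(X_aud, X_aud)

# Steps 2-3: a graph encoding "external knowledge"; here same-label pairs
# stand in for relevance-feedback links (an assumption).
Wg = (labels[:, None] == labels[None, :]).astype(float)
L = np.diag(Wg.sum(1)) - Wg                # graph Laplacian

def solve(K, Y, L, alpha=0.1, beta=0.1):
    # Closed-form minimizer of ||K W - Y||^2 + alpha tr(W^T K L K W) + beta ||W||^2.
    A = K.T @ K + alpha * K @ L @ K + beta * np.eye(len(K))
    return np.linalg.solve(A, K.T @ Y)

W_img = solve(K_img, Y, L)
W_aud = solve(K_aud, Y, L)

# Both modalities now project into one shared semantic space.
S_img = K_img @ W_img
S_aud = K_aud @ W_aud

# Cross-media retrieval: rank audio items by cosine similarity to an image query.
q = S_img[0] / np.linalg.norm(S_img[0])
sims = S_aud @ q / np.linalg.norm(S_aud, axis=1)
ranking = np.argsort(-sims)
```

The design point is that, once both modalities are regressed into the same semantic space, retrieval reduces to nearest-neighbor search there, regardless of which modality the query comes from.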