Multimodal image retrieval via Bayesian information fusion

  • Authors:
  • Rui Zhang; Ling Guan

  • Affiliations:
  • Ryerson Multimedia Research Laboratory, Ryerson University, Toronto, ON, Canada (both authors)

  • Venue:
  • ICME '09: Proceedings of the 2009 IEEE International Conference on Multimedia and Expo
  • Year:
  • 2009

Abstract

In this paper, we propose a multimodal image retrieval framework that integrates information from the audio and visual domains via Bayesian decision-level fusion. In each domain, a statistical model is learned for every semantic class. Using Bayes' theorem, the a posteriori probability of each class given a query is computed in the audio domain and propagated to the images classified into the corresponding semantic class in the visual domain. These probabilistic measures serve as the a priori probabilities in the overall framework and are combined with likelihoods evaluated by nearest-neighbor content-based image retrieval. Applying Bayes' theorem again, the images are ranked by their a posteriori probabilities given the audio and visual features of the query. To further improve the system, we also propose a relevance feedback scheme in the audio domain. Experimental results demonstrate the advantage of the proposed method over retrieval based on visual features alone.
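The abstract describes a two-stage application of Bayes' theorem: an audio-domain class posterior is propagated to images as a per-image prior, then fused with a visual likelihood to rank the database. The following minimal sketch illustrates that fusion pipeline on synthetic data; the unit-variance Gaussian audio model, the exponential distance-to-likelihood mapping, and all names are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of decision-level Bayesian fusion for multimodal retrieval.
# Models and parameters below are toy assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 3 semantic classes with audio and visual feature spaces.
n_classes, d_audio, d_visual, imgs_per_class = 3, 4, 8, 20
audio_means = rng.normal(size=(n_classes, d_audio))          # audio class models
db_labels = np.repeat(np.arange(n_classes), imgs_per_class)  # image class labels
db_visual = rng.normal(size=(db_labels.size, d_visual)) + db_labels[:, None]

def audio_posterior(query_audio):
    """P(class | audio query) via Bayes' theorem, assuming a unit-variance
    Gaussian model per class and a uniform class prior."""
    log_lik = -0.5 * np.sum((query_audio - audio_means) ** 2, axis=1)
    post = np.exp(log_lik - log_lik.max())  # subtract max for stability
    return post / post.sum()

def visual_likelihood(query_visual):
    """Likelihood of each database image given the visual query, here an
    exponential kernel on nearest-neighbor distance (an assumption)."""
    dist = np.linalg.norm(db_visual - query_visual, axis=1)
    return np.exp(-dist)

def fused_ranking(query_audio, query_visual):
    # Audio posterior, propagated to images of the same semantic class,
    # acts as the per-image a priori probability in the overall framework.
    prior = audio_posterior(query_audio)[db_labels]
    # Bayes' theorem again: posterior proportional to prior times likelihood.
    post = prior * visual_likelihood(query_visual)
    post /= post.sum()
    return np.argsort(post)[::-1]  # image indices, best match first

# Query drawn near class 1 in both modalities.
ranking = fused_ranking(audio_means[1], rng.normal(size=d_visual) + 1)
print("top-5 retrieved images:", ranking[:5])
print("their labels:", db_labels[ranking[:5]])
```

In this toy run, images whose class matches the audio-inferred class receive a boosted prior, so they dominate the top ranks even when their visual distances are comparable to those of other classes, which is the intended effect of the decision-level fusion.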