A collaborative Bayesian image retrieval framework
ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
This paper proposes a multimodal image retrieval framework that integrates information from the audio and visual domains through Bayesian decision-level fusion. In each domain, a statistical model is learned for every semantic class. Using Bayes' theorem, the a posteriori probability of each class given a query is computed in the audio domain and then propagated to the images classified into the corresponding semantic class in the visual domain. These probabilities serve as the a priori probabilities in the overall framework, where they are combined with a likelihood evaluated by nearest-neighbor content-based image retrieval. Applying Bayes' theorem a second time, the images are ranked by their a posteriori probabilities given the audio and visual features of the query. To further improve the system, we also propose a relevance feedback scheme in the audio domain. Experimental results demonstrate the advantage of the proposed method over retrieval based on visual features alone.
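The two-stage use of Bayes' theorem described above can be illustrated with a minimal sketch. It is not the authors' implementation: the diagonal-Gaussian audio class models, the exponential kernel that turns a visual distance into a likelihood, the `bandwidth` parameter, and all function names and toy data are assumptions made only for illustration. The abstract does not specify any of these choices.

```python
import math


def audio_class_posterior(audio_feat, class_models, class_priors):
    """P(class | audio query) via Bayes' theorem.

    Assumption: each class is modeled by a diagonal Gaussian (mean, variance).
    """
    log_scores = {}
    for label, (mean, var) in class_models.items():
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(audio_feat, mean, var)
        )
        log_scores[label] = log_lik + math.log(class_priors[label])
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    unnorm = {label: math.exp(s - top) for label, s in log_scores.items()}
    total = sum(unnorm.values())
    return {label: u / total for label, u in unnorm.items()}


def visual_likelihood(query_feat, image_feat, bandwidth=1.0):
    """Assumed likelihood: decays with squared Euclidean distance to the query."""
    d2 = sum((q - f) ** 2 for q, f in zip(query_feat, image_feat))
    return math.exp(-d2 / (2 * bandwidth ** 2))


def rank_images(audio_feat, visual_feat, images, class_models, class_priors):
    """Rank images by posterior, proportional to audio prior times visual likelihood."""
    class_post = audio_class_posterior(audio_feat, class_models, class_priors)
    scores = {}
    for image_id, (label, feat) in images.items():
        prior = class_post[label]  # audio posterior propagated to the image
        scores[image_id] = prior * visual_likelihood(visual_feat, feat)
    total = sum(scores.values()) or 1.0  # avoid division by zero on underflow
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [(image_id, scores[image_id] / total) for image_id in ranked]


# Hypothetical toy data: one-dimensional audio features, two-dimensional visual features.
class_models = {"dog": ([0.0], [1.0]), "bird": ([5.0], [1.0])}
class_priors = {"dog": 0.5, "bird": 0.5}
images = {
    "img_dog_1": ("dog", [0.0, 0.0]),
    "img_bird_1": ("bird", [0.1, 0.0]),
    "img_dog_2": ("dog", [3.0, 3.0]),
}

class_posterior = audio_class_posterior([0.2], class_models, class_priors)
ranking = rank_images([0.2], [0.0, 0.0], images, class_models, class_priors)
for image_id, posterior in ranking:
    print(image_id, posterior)
```

In this toy setup the audio query lies close to the assumed "dog" model, so the propagated prior places both dog images ahead of the bird image even though the bird image is visually nearer to the query than the second dog image. A ranking based on visual distance alone would order them differently, which is the kind of effect decision-level fusion is meant to capture.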