Multimodal pLSA on visual features and tags

  • Authors:
  • Stefan Romberg, Eva Hörster, Rainer Lienhart

  • Affiliations:
  • Multimedia Computing Lab, University of Augsburg, Augsburg, Germany (all authors)

  • Venue:
  • ICME'09 Proceedings of the 2009 IEEE international conference on Multimedia and Expo
  • Year:
  • 2009


Abstract

This work studies a new approach to image retrieval on large-scale community databases. The proposed system exploits two different modalities: visual features and community-generated metadata such as tags. Topic models are used to derive, for each image in the database, a high-level representation well suited to retrieval. The approach is evaluated experimentally in a query-by-example retrieval task and compared against systems relying solely on visual features or tag features. The proposed multimodal system outperforms the unimodal systems by approximately 36%.
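The pipeline the abstract describes (combine visual and tag modalities, derive a per-image topic representation, then rank by similarity for query-by-example retrieval) can be sketched as follows. This is a minimal illustration under assumed data, not the paper's implementation: the feature counts are synthetic, and scikit-learn's NMF is used as an accessible stand-in for pLSA, which the paper actually uses.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)

# Hypothetical data: per-image bag-of-visual-words counts and tag counts.
visual_bow = rng.integers(0, 5, size=(100, 50)).astype(float)
tag_bow = rng.integers(0, 3, size=(100, 30)).astype(float)

# Multimodal representation: concatenate the two modalities' count vectors.
X = np.hstack([visual_bow, tag_bow])

# Topic model: NMF used here as a stand-in for pLSA (assumption, see lead-in).
model = NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0)
topics = model.fit_transform(X)  # per-image topic activations

# Query-by-example: rank database images by similarity in topic space.
query = topics[0:1]
scores = cosine_similarity(query, topics).ravel()
ranking = np.argsort(-scores)  # indices of images, most similar first
```

The key design point mirrored here is that both modalities are fused before topic learning, so each topic can capture co-occurring visual words and tags; retrieval then operates entirely in the low-dimensional topic space rather than on raw features.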