Multi-feature pLSA for combining visual features in image annotation

  • Authors:
  • Rui Zhang (Ryerson University, Toronto, ON, Canada); Lei Zhang (Microsoft Research Asia, Beijing, China); Xin-Jing Wang (Microsoft Research Asia, Beijing, China); Ling Guan (Ryerson University, Toronto, ON, Canada)

  • Venue:
  • MM '11 Proceedings of the 19th ACM international conference on Multimedia
  • Year:
  • 2011


Abstract

In this paper, we study the problem of combining low-level visual features for image region annotation. The problem is tackled with a novel method that combines texture and color features via a mixture model of their joint distribution. The structure of the proposed model can be viewed as an extension of probabilistic latent semantic analysis (pLSA): it handles data from two different visual feature domains by attaching one additional leaf node to the graphical structure of the original pLSA. The proposed approach is therefore referred to as multi-feature pLSA (MF-pLSA). A supervised paradigm is adopted in which the MF-pLSA classifies a new image region into one of several pre-defined object categories. To evaluate performance, we conducted experiments on the VOC2009 and LabelMe databases under various settings of the number of visual words and mixture components. Measured by average recall and precision, the MF-pLSA outperforms seven other approaches, including alternative schemes for visual feature combination.
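To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of EM for a two-modality pLSA of the kind the abstract describes: each image region (document) d emits a texture word t and a color word c, conditionally independent given a latent topic z, so that P(t, c | d) = Σ_z P(z | d) P(t | z) P(c | z). Function and variable names here (`mf_plsa`, `counts`, etc.) are illustrative assumptions, and details such as the exact parameterization and initialization may differ from the paper.

```python
import numpy as np

def mf_plsa(counts, n_topics, n_iter=50, seed=0):
    """EM for a two-modality pLSA (illustrative sketch).

    counts : array of shape (D, T, C) holding co-occurrence counts of
             (region/document, texture word, color word).
    Returns P(z|d), P(t|z), P(c|z).
    """
    rng = np.random.default_rng(seed)
    D, T, C = counts.shape
    pz_d = rng.dirichlet(np.ones(n_topics), size=D)   # P(z|d), rows sum to 1
    pt_z = rng.dirichlet(np.ones(T), size=n_topics)   # P(t|z)
    pc_z = rng.dirichlet(np.ones(C), size=n_topics)   # P(c|z)
    for _ in range(n_iter):
        # E-step: responsibilities q(z|d,t,c) ∝ P(z|d) P(t|z) P(c|z)
        q = np.einsum('dz,zt,zc->dtcz', pz_d, pt_z, pc_z)
        q /= q.sum(axis=-1, keepdims=True) + 1e-12
        # Weight responsibilities by the observed co-occurrence counts
        nq = counts[..., None] * q                    # shape (D, T, C, Z)
        # M-step: renormalize expected counts along the right axes
        pz_d = nq.sum(axis=(1, 2))
        pz_d /= pz_d.sum(axis=1, keepdims=True) + 1e-12
        pt_z = nq.sum(axis=(0, 2)).T
        pt_z /= pt_z.sum(axis=1, keepdims=True) + 1e-12
        pc_z = nq.sum(axis=(0, 1)).T
        pc_z /= pc_z.sum(axis=1, keepdims=True) + 1e-12
    return pz_d, pt_z, pc_z
```

For supervised annotation as described above, one such model could be fit per object category and a new region assigned to the category whose model gives the highest likelihood of its texture and color words.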