Multi-modal image annotation with multi-instance multi-label LDA

  • Authors:
  • Cam-Tu Nguyen; De-Chuan Zhan; Zhi-Hua Zhou

  • Affiliations:
  • National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China (all authors)

  • Venue:
  • IJCAI'13: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence
  • Year:
  • 2013

Abstract

This paper studies the problem of image annotation in a multi-modal setting where both visual and textual information are available. We propose Multi-modal Multi-instance Multi-label Latent Dirichlet Allocation (M3LDA), where the model consists of a visual-label part, a textual-label part and a label-topic part. The basic idea is that the topic decided by the visual information and the topic decided by the textual information should be consistent, leading to the correct label assignment. In particular, M3LDA is able to annotate image regions, thus providing a promising way to understand the relation between input patterns and output semantics. Experiments on Corel5K and ImageCLEF validate the effectiveness of the proposed method.
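
The abstract describes the model only at a high level, so the following is a minimal, hypothetical sketch of the cross-modal consistency idea rather than the authors' actual M3LDA. It assumes a Labeled-LDA-style simplification where each topic is identified with one label, visual regions are quantized into codewords, and both modalities share a single per-image topic distribution; the toy data, hyperparameters, and names such as `resample` and `n_dk` are our own.

```python
# Hypothetical sketch, NOT the authors' exact model: visual codewords and text
# words of an image share one topic-count table (n_dk), restricted to the
# image's candidate labels, so topic evidence from both modalities must agree.
import numpy as np

rng = np.random.default_rng(0)

K = 3                   # number of topics == number of labels (assumption)
V_vis, V_txt = 5, 7     # vocabulary sizes: visual codewords, text words
alpha, beta = 0.5, 0.1  # symmetric Dirichlet hyperparameters

# Toy corpus: each image = (visual codeword ids, text word ids, label set)
images = [
    (rng.integers(0, V_vis, 20), rng.integers(0, V_txt, 10), [0, 1]),
    (rng.integers(0, V_vis, 20), rng.integers(0, V_txt, 10), [1, 2]),
]

# Count tables; n_dk is shared by both modalities and couples them
n_dk = np.zeros((len(images), K))
n_kv_vis = np.zeros((K, V_vis)); n_k_vis = np.zeros(K)
n_kv_txt = np.zeros((K, V_txt)); n_k_txt = np.zeros(K)

# Random initialization, topics restricted to each image's label set
z_vis, z_txt = [], []
for d, (vis, txt, labels) in enumerate(images):
    zv = rng.choice(labels, size=len(vis)); z_vis.append(zv)
    zt = rng.choice(labels, size=len(txt)); z_txt.append(zt)
    for w, k in zip(vis, zv):
        n_dk[d, k] += 1; n_kv_vis[k, w] += 1; n_k_vis[k] += 1
    for w, k in zip(txt, zt):
        n_dk[d, k] += 1; n_kv_txt[k, w] += 1; n_k_txt[k] += 1

def resample(d, w, k_old, labels, n_kv, n_k, V):
    """Collapsed Gibbs step for one token, topics limited to the label set."""
    n_dk[d, k_old] -= 1; n_kv[k_old, w] -= 1; n_k[k_old] -= 1
    ks = np.array(labels)
    # The shared n_dk pulls visual and textual tokens of the same image
    # toward a consistent topic (label) assignment.
    p = (n_dk[d, ks] + alpha) * (n_kv[ks, w] + beta) / (n_k[ks] + V * beta)
    k_new = ks[rng.choice(len(ks), p=p / p.sum())]
    n_dk[d, k_new] += 1; n_kv[k_new, w] += 1; n_k[k_new] += 1
    return k_new

for _ in range(200):  # a few sweeps suffice for this toy corpus
    for d, (vis, txt, labels) in enumerate(images):
        for i, w in enumerate(vis):
            z_vis[d][i] = resample(d, w, z_vis[d][i], labels,
                                   n_kv_vis, n_k_vis, V_vis)
        for i, w in enumerate(txt):
            z_txt[d][i] = resample(d, w, z_txt[d][i], labels,
                                   n_kv_txt, n_k_txt, V_txt)

# Region-level annotation: each visual instance's sampled topic is read off
# as its label, the instance-level output the abstract alludes to.
for d in range(len(images)):
    print(f"image {d}: region labels = {z_vis[d].tolist()}")
```

The design point this sketch isolates is the shared per-image topic distribution: because visual and textual tokens update the same `n_dk` counts, a topic assignment that fits only one modality is penalized, which is one way to realize the consistency requirement stated in the abstract.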