This paper studies the problem of image annotation in a multi-modal setting where both visual and textual information are available. We propose Multimodal Multi-instance Multi-label Latent Dirichlet Allocation (M3LDA), whose model consists of a visual-label part, a textual-label part, and a label-topic part. The basic idea is that the topic inferred from the visual information and the topic inferred from the textual information should be consistent, leading to correct label assignment. In particular, M3LDA is able to annotate individual image regions, thus providing a promising way to understand the relation between input patterns and output semantics. Experiments on Corel5K and ImageCLEF validate the effectiveness of the proposed method.
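The consistency idea described above can be illustrated with a minimal sketch. This is not the paper's actual inference procedure; it only assumes that each modality yields a per-topic posterior and that topics map to labels, and it combines the two posteriors multiplicatively so the consensus topic determines the label.

```python
import numpy as np

def consensus_label(p_topic_visual, p_topic_textual, topic_to_label):
    """Pick the label of the topic both modalities jointly prefer.

    p_topic_visual, p_topic_textual: per-topic probabilities (same length),
        e.g. posteriors from a visual-label and a textual-label model.
    topic_to_label: list mapping each topic index to a label string
        (a hypothetical stand-in for the label-topic part).
    """
    # Multiplying the posteriors rewards topics on which the visual and
    # textual modalities agree, then we renormalize the product.
    joint = np.asarray(p_topic_visual) * np.asarray(p_topic_textual)
    joint = joint / joint.sum()
    return topic_to_label[int(np.argmax(joint))]

# Example: the visual posterior mildly favors topic 0, the textual
# posterior strongly favors topic 1; the consensus picks topic 1.
labels = ["tiger", "grass", "water"]
print(consensus_label([0.5, 0.4, 0.1], [0.1, 0.8, 0.1], labels))  # grass
```

In the full model this agreement is enforced during topic inference rather than as a post-hoc product, but the sketch captures why requiring the two modalities to share a topic drives label assignment toward regions where visual and textual evidence coincide.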