Learning with multi-modal data is at the core of many multimedia applications, such as cross-modal retrieval and image annotation. In this paper, we present a nonparametric Bayesian approach to learning upstream supervised topic models for analyzing multi-modal data. Our model uses a compound nonparametric Bayesian multi-modal prior to describe the correlation structure of the data both within each modality and between modalities. It extends the hierarchical Dirichlet process (HDP) by incorporating upstream supervised response variables and the values of latent functions under a Gaussian process (GP) prior. Upstream responses shared by data from multiple modalities benefit discriminative training, and the GP allows flexible structure learning of correlations. Hence, our model inherits automatic determination of the number of topics from the HDP, structure learning from the GP, and enhanced predictive capacity from upstream supervision. We also provide efficient variational inference and prediction algorithms. Empirical studies demonstrate superior performance on several benchmark datasets compared with competing methods.