Nonparametric Bayesian upstream supervised multi-modal topic models

Authors:
Renjie Liao;Jun Zhu;Zengchang Qin
Affiliations:
The Chinese University of Hong Kong, Hong Kong, Hong Kong;Tsinghua University, Beijing, China;Beihang University, Beijing, China
Venue:
Proceedings of the 7th ACM international conference on Web search and data mining
Year:
2014

Abstract

Learning with multi-modal data is at the core of many multimedia applications, such as cross-modal retrieval and image annotation. In this paper, we present a nonparametric Bayesian approach to learning upstream supervised topic models for analyzing multi-modal data. Our model builds a compound nonparametric Bayesian multi-modal prior to describe the correlation structure of data both within each individual modality and between different modalities. It extends the hierarchical Dirichlet process (HDP) by incorporating upstream supervised response variables and the values of latent functions under a Gaussian process (GP) prior. Upstream responses shared by data from multiple modalities benefit discriminative training, and the GP allows flexible structure learning of correlations. Hence, our model inherits automatic determination of the number of topics from the HDP, structure learning from the GP, and enhanced predictive capacity from upstream supervision. We also provide efficient variational inference and prediction algorithms. Empirical studies demonstrate superior performance on several benchmark datasets compared with previous methods.
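
As a rough illustration of the HDP building block the abstract refers to, the sketch below draws truncated stick-breaking weights for a shared set of topics and then per-document topic proportions centred on them. It is not the model proposed in the paper: it omits the upstream responses, the GP component, and the multi-modal prior. The function names, the truncation level, and the concentration values are all assumptions chosen for illustration.

```python
import random


def stick_breaking(gamma, truncation, rng):
    """Truncated GEM(gamma) weights: beta_k = v_k * prod_{j<k} (1 - v_j)."""
    weights, remaining = [], 1.0
    for k in range(truncation):
        # The last stick takes all remaining mass so the weights sum to 1.
        v = 1.0 if k == truncation - 1 else rng.betavariate(1.0, gamma)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


def document_weights(alpha, global_weights, rng):
    """Finite approximation of pi_d ~ DP(alpha, beta): Dirichlet(alpha * beta)."""
    # A Dirichlet draw is a vector of normalized Gamma draws; the small floor
    # keeps the shape parameter strictly positive (illustrative safeguard).
    draws = [rng.gammavariate(max(alpha * b, 1e-12), 1.0) for b in global_weights]
    total = sum(draws)
    return [g / total for g in draws]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
beta = stick_breaking(gamma=1.0, truncation=10, rng=rng)
docs = [document_weights(alpha=5.0, global_weights=beta, rng=rng) for _ in range(3)]
```

Because every document draws its proportions around the same global weights `beta`, topics are shared across documents, and most of the mass typically falls on a few topics. In the untruncated process the number of topics in use is not fixed in advance, which is the property the abstract describes as automatic determination of the number of topics.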