Learning an image-word embedding for image auto-annotation on the nonlinear latent space

Authors:
Wei Liu;Xiaoou Tang
Affiliations:
Chinese University of Hong Kong, Shatin, Hong Kong;Microsoft Research Asia, Beijing, China
Venue:
Proceedings of the 13th annual ACM international conference on Multimedia
Year:
2005

Citing 8
Cited 3

Nonlinear component analysis as a kernel eigenvalue problem

Neural Computation
Unsupervised learning by probabilistic latent semantic analysis

Machine Learning
Modeling annotated data

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach

IEEE Transactions on Pattern Analysis and Machine Intelligence
Matching words and pictures

The Journal of Machine Learning Research
On image auto-annotation with latent space models

MULTIMEDIA '03 Proceedings of the eleventh ACM international conference on Multimedia
PLSA-based image auto-annotation: constraining the latent space

Proceedings of the 12th annual ACM international conference on Multimedia
CBSA: content-based soft annotation for multimodal image retrieval using Bayes point machines

IEEE Transactions on Circuits and Systems for Video Technology

Toward bridging the annotation-retrieval gap in image search by a generative modeling approach

MULTIMEDIA '06 Proceedings of the 14th annual ACM international conference on Multimedia
Tagging over time: real-world image annotation by lightweight meta-learning

Proceedings of the 15th international conference on Multimedia
Image retrieval: Ideas, influences, and trends of the new age

ACM Computing Surveys (CSUR)

Abstract

Latent Semantic Analysis (LSA) has shown encouraging performance on unsupervised automatic image annotation. LSA annotates images by propagating keywords on a linear latent space that accounts for the underlying semantic structure of word and image features. In this paper, we formulate a more general nonlinear model, called the Nonlinear Latent Space model, to reveal the latent variables of word and visual features more precisely. In place of the basic propagation strategy, we present a novel inference strategy for image annotation via Image-Word Embedding (IWE). IWE embeds images and words simultaneously and captures the dependencies between them from a probabilistic viewpoint. Experiments show that IWE-based annotation on the nonlinear latent space outperforms previous unsupervised annotation methods.
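
As an illustration of the linear baseline the abstract refers to, the following is a minimal sketch of LSA-style keyword propagation, not the paper's nonlinear model or its IWE inference. It assumes each training image is a visual-feature count vector concatenated with a word-indicator vector, that the latent space is a truncated SVD of that joint matrix, and that keywords are propagated from the nearest training images by cosine similarity. The function name `lsa_propagate`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np

def lsa_propagate(visual_train, words_train, visual_query, k=2, n_neighbors=2):
    """Score vocabulary words for one unannotated image (illustrative only).

    visual_train: (n_images, n_visual) visual-feature counts
    words_train:  (n_images, n_words) word-indicator matrix
    visual_query: (n_visual,) visual features of the query image
    """
    n_words = words_train.shape[1]
    # Joint image representation: visual features followed by annotation words.
    joint = np.hstack([visual_train, words_train]).astype(float)
    # Linear latent space: top-k right singular vectors of the joint matrix.
    _, _, vt = np.linalg.svd(joint, full_matrices=False)
    basis = vt[:k].T
    train_latent = joint @ basis
    # The query has no words yet, so its word part is filled with zeros.
    query = np.concatenate([visual_query, np.zeros(n_words)])
    query_latent = query @ basis
    # Cosine similarity between the query and every training image.
    eps = 1e-12
    sims = (train_latent @ query_latent) / (
        np.linalg.norm(train_latent, axis=1) * np.linalg.norm(query_latent) + eps
    )
    # Propagate words from the nearest training images, weighted by similarity.
    nearest = np.argsort(sims)[::-1][:n_neighbors]
    return sims[nearest] @ words_train[nearest]

# Toy data (assumed): two "sky" images and two "grass" images.
vocab = ["sky", "grass"]
visual_train = np.array([[5, 1, 0, 0], [4, 2, 0, 0], [0, 0, 6, 1], [0, 0, 3, 2]])
words_train = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

scores = lsa_propagate(visual_train, words_train, np.array([5, 2, 0, 0]))
best = vocab[int(np.argmax(scores))]
```

A nonlinear variant could replace the SVD step with a kernel method such as kernel PCA, which appears in the reference list, but how the paper actually constructs its nonlinear latent space is not stated in the abstract.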