Modeling continuous visual features for semantic image annotation and retrieval

  • Authors:
  • Zhixin Li; Zhiping Shi; Xi Liu; Zhongzhi Shi

  • Affiliations:
  • Zhixin Li: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China and College of Computer Science and Information Technolo ...
  • Zhiping Shi, Xi Liu, Zhongzhi Shi: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2011

Abstract

Automatic image annotation has become an important and challenging problem due to the semantic gap between low-level visual features and high-level semantics. In this paper, we first extend probabilistic latent semantic analysis (PLSA) to model continuous quantities, and derive the corresponding Expectation-Maximization (EM) algorithm to estimate the model parameters. Then, to handle data of different modalities according to their characteristics, we present a semantic annotation model that employs continuous PLSA for visual features and standard PLSA for textual words. The model learns the correlation between the two modalities through an asymmetric learning approach and can then predict semantic annotations precisely for unseen images. Finally, we compare our approach with several state-of-the-art approaches on the Corel5k and Corel30k datasets. The experimental results show that our approach performs more effectively and accurately.
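The continuous PLSA sketched in the abstract models each image's visual feature vectors as a document-specific mixture over latent aspects, with each aspect a Gaussian in feature space, and fits the parameters by EM. A minimal illustrative sketch of that idea follows; it assumes diagonal Gaussians shared across documents and uses hypothetical names, so it is not the authors' exact formulation or derivation:

```python
import numpy as np

def gaussian_logpdf(X, mu, var):
    """Log-density of each row of X under N(mu, diag(var))."""
    D = X.shape[1]
    return -0.5 * (np.sum((X - mu) ** 2 / var, axis=1)
                   + np.sum(np.log(var)) + D * np.log(2 * np.pi))

def continuous_plsa(docs, K, n_iter=50, seed=0):
    """EM for a continuous-PLSA-style model (illustrative).

    docs: list of (n_i, D) arrays of visual feature vectors, one per image.
    Model: p(x | d) = sum_z P(z | d) * N(x; mu_z, diag(var_z)).
    Returns (P(z|d) per document, aspect means, aspect variances).
    """
    rng = np.random.default_rng(seed)
    X = np.vstack(docs)                                   # all features stacked
    N, D = X.shape
    doc_idx = np.repeat(np.arange(len(docs)), [len(d) for d in docs])
    Pz_d = rng.dirichlet(np.ones(K), size=len(docs))      # document mixing weights
    mu = X[rng.choice(N, K, replace=False)]               # init means from the data
    var = np.full((K, D), X.var(axis=0) + 1e-6)           # shared initial variances
    for _ in range(n_iter):
        # E-step: responsibilities p(z | d, x) for every feature vector
        log_resp = np.stack([gaussian_logpdf(X, mu[k], var[k]) for k in range(K)],
                            axis=1)
        log_resp += np.log(Pz_d[doc_idx] + 1e-12)
        log_resp -= log_resp.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: per-document mixing weights, shared Gaussian parameters
        for d in range(len(docs)):
            mask = doc_idx == d
            Pz_d[d] = resp[mask].sum(axis=0) / mask.sum()
        Nk = resp.sum(axis=0) + 1e-12
        mu = (resp.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            var[k] = (resp[:, k] @ (diff ** 2)) / Nk[k] + 1e-6
    return Pz_d, mu, var
```

In the paper's asymmetric setting, the aspect distributions P(z|d) learned from visual features on training images would be held fixed while the word-aspect distributions of a standard PLSA are fit, so that an unseen image's P(z|d) can be folded in from its visual features alone and used to score candidate annotation words.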