Recent progress in Automatic Image Annotation (AIA) has been achieved by exploiting either low-level visual features or high-level semantic context. Integrating these two paradigms is a promising way to further improve AIA performance, yet very few previous works have studied them in a unified framework. In this paper, we propose a unified model based on Conditional Random Fields (CRF) that establishes tight interaction between visual features and semantic context. In particular, Kernelized Logistic Regression (KLR) with multiple visual distance learning is embedded into the CRF framework. We introduce L1 and L2 regularization terms into the unified learning process, for the distance learning and the parameter penalty respectively. Experiments are conducted on two benchmarks, the Corel and TRECVID-2005 data sets. The results show that, compared with state-of-the-art methods, the unified model achieves a significant improvement in annotation performance and is more robust as the number of visual feature types increases.
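To make the visual side of the model more concrete, the following is a minimal sketch of a KLR objective for a single label, in which several per-feature distance matrices are combined by learned weights, with an L1 penalty on the distance weights and an L2 penalty on the classifier parameters. The abstract does not give the exact formulation, so the exponential kernel form, the function names, and the values of `lam1` and `lam2` are assumptions made for illustration only. The CRF potentials that model semantic context between labels are not included.

```python
import numpy as np

def combined_kernel(distances, w):
    """Build a kernel from a weighted sum of per-feature distance matrices.

    distances: array of shape (M, N, N), one distance matrix per visual feature.
    w: array of shape (M,), distance weights (assumed form, not from the paper).
    """
    combined = np.tensordot(w, distances, axes=1)  # weighted sum, shape (N, N)
    return np.exp(-combined)

def klr_objective(alpha, w, distances, y, lam1=0.1, lam2=0.1):
    """Regularized negative log-likelihood of a binary KLR for one label.

    alpha: (N,) kernel expansion coefficients.
    y: (N,) labels in {0, 1}.
    lam1, lam2: hypothetical regularization strengths.
    """
    K = combined_kernel(distances, w)
    scores = K @ alpha
    # log(1 + exp(s)) - y * s, computed stably with logaddexp
    nll = np.sum(np.logaddexp(0.0, scores) - y * scores)
    l1_on_distance_weights = lam1 * np.sum(np.abs(w))
    l2_on_parameters = lam2 * np.dot(alpha, alpha)
    return nll + l1_on_distance_weights + l2_on_parameters
```

In this sketch the L1 term pushes the weights of uninformative distance measures toward zero, which is one plausible reading of why such a model could stay stable as more visual features are added.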