Visual language modeling for image classification

  • Authors:
  • Lei Wu; Mingjing Li; Zhiwei Li; Wei-Ying Ma; Nenghai Yu

  • Affiliations:
  • University of Science and Technology of China, Hefei, China; Microsoft Research Asia, Beijing, China; Microsoft Research Asia, Beijing, China; Microsoft Research Asia, Beijing, China; University of Science and Technology of China, Hefei, China

  • Venue:
  • Proceedings of the International Workshop on Multimedia Information Retrieval (MIR)
  • Year:
  • 2007


Abstract

Although it has been studied for many years, image classification remains a challenging problem. In this paper, we propose a visual language modeling method for content-based image classification. It transforms each image into a matrix of visual words and assumes that each visual word is conditionally dependent on its neighbors. For each image category, a visual language model is constructed from a set of training images, capturing both the co-occurrence and the proximity information of visual words. Depending on how many neighbors are taken into consideration, three kinds of language models can be trained, namely unigram, bigram, and trigram, each corresponding to a different level of model complexity. Given a test image, its category is determined by estimating how likely the image is to have been generated by each category's model. Compared with traditional methods based on bag-of-words models, the proposed method can effectively exploit the spatial correlation of visual words for image classification. In addition, we propose to use absent words, i.e., words that appear frequently in a category but not in the target image, to aid classification. Experimental results show that our method achieves comparable accuracy while performing classification much more quickly.
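The classification scheme described above can be sketched in a simplified form. The sketch below is not the authors' implementation: it assumes each image has already been quantized into a 2-D integer matrix of visual-word IDs, and it reduces the neighborhood to the horizontal left neighbor only (a bigram model with add-alpha smoothing), whereas the paper considers richer 2-D neighborhoods. Function names and the toy data are hypothetical.

```python
import numpy as np

def train_bigram_model(images, vocab_size, alpha=1.0):
    """Estimate P(w | left neighbor) for one category from a list of
    2-D arrays of visual-word IDs, with add-alpha smoothing."""
    counts = np.full((vocab_size, vocab_size), alpha)
    for img in images:
        for row in img:
            for prev, cur in zip(row[:-1], row[1:]):
                counts[prev, cur] += 1
    # Normalize each row to obtain conditional probabilities.
    return counts / counts.sum(axis=1, keepdims=True)

def log_likelihood(img, model):
    """Log-probability of the image's visual-word matrix under a model."""
    ll = 0.0
    for row in img:
        for prev, cur in zip(row[:-1], row[1:]):
            ll += np.log(model[prev, cur])
    return ll

def classify(img, models):
    """Assign the category whose language model gives the highest likelihood."""
    return max(models, key=lambda c: log_likelihood(img, models[c]))

# Toy example with a 2-word vocabulary: one "striped" category
# (alternating words) and one "blocky" category (repeated words).
models = {
    "stripes": train_bigram_model([np.array([[0, 1, 0, 1], [0, 1, 0, 1]])], 2),
    "blocks":  train_bigram_model([np.array([[0, 0, 1, 1], [0, 0, 1, 1]])], 2),
}
print(classify(np.array([[1, 0, 1, 0]]), models))  # → stripes
```

Training such a model amounts to counting neighbor pairs once per category, so classification requires only table lookups, which is consistent with the speed advantage over bag-of-words classifiers reported in the abstract.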