Language modeling for bag-of-visual words image categorization

Authors:
Pierre Tirilly; Vincent Claveau; Patrick Gros
Affiliations:
CNRS, Rennes, France; CNRS, Rennes, France; INRIA, Rennes, France
Venue:
CIVR '08 Proceedings of the 2008 international conference on Content-based image and video retrieval
Year:
2008

References:

A language modeling approach to information retrieval
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval

Unsupervised learning by probabilistic latent semantic analysis
Machine Learning

Theory of keyblock-based image retrieval
ACM Transactions on Information Systems (TOIS)

Introduction to Modern Information Retrieval

Video Google: A Text Retrieval Approach to Object Matching in Videos
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2

Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories
CVPRW '04 Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04) - Volume 12

A Performance Evaluation of Local Descriptors
IEEE Transactions on Pattern Analysis and Machine Intelligence

Discovering Objects and their Localization in Images
ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV'05) - Volume 1

A Comparison of Affine Region Detectors
International Journal of Computer Vision

Scalable Recognition with a Vocabulary Tree
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2

Effective and efficient object-based image retrieval using visual phrases
MULTIMEDIA '06 Proceedings of the 14th annual ACM international conference on Multimedia

Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study
International Journal of Computer Vision

Visual language modeling for image classification
Proceedings of the international workshop on multimedia information retrieval

Evaluating bag-of-visual-words representations in scene classification
Proceedings of the international workshop on multimedia information retrieval

Flexible spatial models for grouping local image features
CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition

Scene classification via pLSA
ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part IV

Abstract

In this paper, we propose two ways of improving image classification based on the bag-of-words representation [25]. This representation has two shortcomings: it discards the spatial information of visual words, and it contains noisy visual words because the vocabulary building process is coarse. First, we propose a new representation of images that extends the analogy with textual data: visual sentences, which allow us to "read" visual words in a certain order, as with text. We can therefore consider simple spatial relations between words. We also present a new image classification scheme that exploits these relations. It is based on language models, a widely used tool in the speech and text analysis communities. Second, we propose two new techniques to eliminate useless words: one based on geometric properties of the keypoints, the other on probabilistic Latent Semantic Analysis (pLSA). Experiments show that our techniques can significantly improve image classification compared to a classical Support Vector Machine-based classification.
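
The idea of reading visual words in order and scoring them with per-category language models can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how the reading order is obtained or which language model is used, so the sketch assumes that keypoints are ordered by projecting their positions onto a fixed axis and that each category is modeled by a bigram model with add-one smoothing. The names `visual_sentence`, `BigramModel`, and `classify`, as well as the toy data, are hypothetical.

```python
import math
from collections import defaultdict


def visual_sentence(keypoints, axis=(1.0, 0.0)):
    """Turn keypoints into an ordered list of visual word ids.

    keypoints: list of (x, y, word_id) tuples.
    axis: direction used to order the words (an assumption of this sketch;
    the abstract only says words are read "in a certain order").
    """
    ax, ay = axis
    ordered = sorted(keypoints, key=lambda k: k[0] * ax + k[1] * ay)
    return [word for _, _, word in ordered]


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size
        self.bigrams = defaultdict(int)   # counts of (previous, current) pairs
        self.contexts = defaultdict(int)  # counts of each previous word

    def fit(self, sentences):
        for sentence in sentences:
            for prev, cur in zip(sentence, sentence[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1
        return self

    def log_prob(self, sentence):
        """Sum of smoothed log P(current | previous) over the sentence."""
        total = 0.0
        for prev, cur in zip(sentence, sentence[1:]):
            num = self.bigrams.get((prev, cur), 0) + 1
            den = self.contexts.get(prev, 0) + self.vocab_size
            total += math.log(num / den)
        return total


def classify(sentence, models):
    """Return the category whose model gives the sentence the highest score."""
    return max(models, key=lambda label: models[label].log_prob(sentence))


# Toy data (hypothetical): category "a" alternates words 0 and 1,
# category "b" mostly repeats word 2.
train = {
    "a": [[0, 1, 0, 1, 0], [1, 0, 1, 0]],
    "b": [[2, 2, 2, 3], [2, 2, 2, 2]],
}
models = {label: BigramModel(vocab_size=4).fit(s) for label, s in train.items()}

query = visual_sentence([(5.0, 1.0, 1), (0.0, 2.0, 0), (9.0, 0.0, 0)])
print(query, classify(query, models))
```

A bag-of-words classifier would see the same histogram for any ordering of these words. The bigram model distinguishes orderings, which is the kind of simple spatial relation between words that the abstract refers to.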