Image classification using spatial pyramid coding and visual word reweighting

Authors:
Chunjie Zhang;Jing Liu;Jinqiao Wang;Qi Tian;Changsheng Xu;Hanqing Lu;Songde Ma
Affiliations:
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China;National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China;National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China;University of Texas at San Antonio, One UTSA Circle, San Antonio Texas;National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China;National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China;National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Venue:
ACCV'10 Proceedings of the 10th Asian conference on Computer vision - Volume Part III
Year:
2010

Citing 15
Cited 1

Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope

International Journal of Computer Vision
Video Google: A Text Retrieval Approach to Object Matching in Videos

ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories

CVPRW '04 Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04) Volume 12 - Volume 12
Object Recognition with Features Inspired by Visual Cortex

CVPR '05 Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 2 - Volume 02
Creating Efficient Codebooks for Visual Recognition

ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1 - Volume 01
The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features

ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 2
Multiclass Object Recognition with Sparse, Localized Features

CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 1
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition

CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
Scene Classification Using a Hybrid Generative/Discriminative Approach

IEEE Transactions on Pattern Analysis and Machine Intelligence
Randomized Clustering Forests for Image Classification

IEEE Transactions on Pattern Analysis and Machine Intelligence
Supervised Learning of Quantizer Codebooks by Information Loss Minimization

IEEE Transactions on Pattern Analysis and Machine Intelligence
Category sensitive codebook construction for object category recognition

ICIP'09 Proceedings of the 16th IEEE international conference on Image processing
Visual Word Ambiguity

IEEE Transactions on Pattern Analysis and Machine Intelligence
Adapted vocabularies for generic visual categorization

ECCV'06 Proceedings of the 9th European conference on Computer Vision - Volume Part IV

Undo the codebook bias by linear transformation for visual applications

Proceedings of the 21st ACM international conference on Multimedia

Quantified Score

Hi-index	0.00

Visualization

Abstract

The ignorance on spatial information and semantics of visual words becomes main obstacles in the bag-of-visual-words (BoW) method for image classification. To address the obstacles, we present an improved BoW representation using spatial pyramid coding (SPC) and visual word reweighting. In SPC procedure, we adopt the sparse coding technique to encode visual features with the spatial constraint. Visual features from the same spatial sub-region of images are collected to generate the visual vocabulary. Additionally, a relaxed but simple solution for semantic embedding into visual words is proposed. We relax the semantic embedding from ideal semantic correspondence to naive semantic purity of visual words, and reweight each visual word according to its semantic purity. Higher weights are given to semantically distinctive visual words, and lower weights to semantically general ones. Experiments on a public dataset demonstrate the effectiveness of the proposed method.