Contextual pooling in image classification

Authors:
Zifeng Wu; Yongzhen Huang; Liang Wang; Tieniu Tan
Affiliations:
National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China (all authors)
Venue:
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part I
Year:
2012

Abstract

The original bag-of-words (BoW) model for image classification treats each local feature independently and thus ignores the spatial relationships between a feature and its neighboring features, that is, the feature's context. However, both intuition and empirical studies indicate that such spatial information is important. Although global spatial information can be captured with the spatial pyramid matching scheme, capturing local spatial relationships between features remains an open problem. In this paper, we propose a new method to embed such local spatial (context) information into the BoW model. A vector reflecting context information is first extracted along with each feature. Context patterns are then trained separately for each code (code-specifically), and the context information is embedded into the BoW model by contextual pooling according to these context patterns. Extensive experiments on the PASCAL VOC 2007 dataset show that our method substantially improves the BoW model and achieves state-of-the-art performance.
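
The abstract gives only a high-level outline, so the following is a minimal sketch of one way the pooling step could look, not the authors' implementation. It assumes hard nearest-codeword assignment, Euclidean distance, count (sum) pooling, and context patterns that are already available as per-code cluster centers; the function names, array shapes, and the final L1 normalization are all hypothetical choices made for illustration.

```python
import numpy as np


def assign_nearest(x, centers):
    """Return, for each row of x, the index of the nearest center (Euclidean)."""
    # Pairwise squared distances, shape (num_rows, num_centers).
    d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def contextual_pooling(descriptors, contexts, codebook, patterns):
    """Pool features per (code, context pattern) pair instead of per code.

    descriptors: (N, D) local features of one image
    contexts:    (N, C) context vector extracted along with each feature
    codebook:    (K, D) visual codewords (assumed given)
    patterns:    (K, P, C) context pattern centers, one set per code (assumed given)
    returns:     (K * P,) L1-normalized histogram
    """
    K, P, _ = patterns.shape
    codes = assign_nearest(descriptors, codebook)  # hard assignment (assumption)
    hist = np.zeros((K, P))
    for k in range(K):
        mask = codes == k
        if not mask.any():
            continue
        # Code-specific step: compare contexts only with the patterns of code k.
        pat = assign_nearest(contexts[mask], patterns[k])
        hist[k] = np.bincount(pat, minlength=P)
    total = hist.sum()
    return (hist / total).ravel() if total > 0 else hist.ravel()


if __name__ == "__main__":
    # Toy data: 2 codewords, 2 context patterns per codeword, 1-D contexts.
    codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
    patterns = np.array([[[0.0], [1.0]], [[0.0], [1.0]]])
    descriptors = np.array([[0.0, 1.0], [1.0, 0.0], [9.0, 10.0], [10.0, 9.0]])
    contexts = np.array([[0.1], [0.9], [0.2], [0.1]])
    print(contextual_pooling(descriptors, contexts, codebook, patterns))
```

In this sketch, summing the result over the pattern axis recovers the ordinary BoW histogram, so the context patterns only refine each code's bin rather than replace it. How the context vector is built and how the patterns are trained are not specified in the abstract and are left out here.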