Geometric $\ell_p$-norm feature pooling for image classification

Authors:
Jiashi Feng; Bingbing Ni; Qi Tian; Shuicheng Yan
Affiliations:
Dept. of Electr. & Comput. Eng., Nat. Univ. of Singapore, Singapore, Singapore; Adv. Digital Sci. Center, Illinois at Singapore Pte Ltd., Singapore, Singapore; Dept. of Comput. Sci., Univ. of Texas at San Antonio, San Antonio, TX, USA; Dept. of Electr. & Comput. Eng., Nat. Univ. of Singapore, Singapore, Singapore
Venue:
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Year:
2011

Abstract

Modern visual classification models generally include a feature pooling step, which aggregates local features over a region of interest into a single statistic through a spatial pooling operation. The two most common operations are average pooling and max pooling. However, recent theoretical analysis has indicated that neither of these two techniques is necessarily optimal. We further show in this work that more severe limitations of both methods stem from the unrecoverable loss of spatial information during statistical summarization and from the over-simplified assumption they make about the feature distribution. We aim to address these inherent issues and to generalize previous pooling methods as follows. We define a weighted $\ell_p$-norm spatial pooling function tailored to the class-specific spatial distribution of features. Moreover, a sensible prior on the spatial correlation of features is incorporated. Optimizing this pooling function for class separability yields the geometric $\ell_p$-norm pooling (GLP) method. GLP preserves class-specific spatial and geometric information in the pooled features and significantly boosts the discriminative power of the resulting features for image classification. Comprehensive evaluations on several image benchmarks demonstrate that, with a single type of feature, GLP outperforms or is comparable to the state of the art.
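
To illustrate how a weighted $\ell_p$-norm can generalize average and max pooling, the following is a minimal sketch and not the authors' implementation. The abstract does not give the exact GLP formulation, so the functional form `(sum_i w_i * |x_i|^p)^(1/p)`, the function name `weighted_lp_pool`, and the hand-set weights are all assumptions made for illustration. In GLP the weights would be learned per class and regularized by a spatial-correlation prior, which this sketch does not attempt.

```python
import math


def weighted_lp_pool(responses, weights, p):
    """Pool the local responses of one feature dimension into a scalar.

    responses: local feature responses, one per spatial location
    weights:   non-negative per-location weights (hypothetical; set by hand
               here, whereas a GLP-style method would learn them per class)
    p:         norm order, p >= 1; math.inf selects the limiting case
    """
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    magnitudes = [abs(r) for r in responses]
    if math.isinf(p):
        # Limit p -> infinity: the largest response among locations
        # that carry a nonzero weight.
        return max((m for m, w in zip(magnitudes, weights) if w > 0), default=0.0)
    total = sum(w * m ** p for m, w in zip(magnitudes, weights))
    return total ** (1.0 / p)


# Four spatial locations with uniform weights 1/N.
responses = [0.2, 0.9, 0.4, 0.1]
uniform = [0.25] * 4

avg_pooled = weighted_lp_pool(responses, uniform, 1)         # average pooling
mid_pooled = weighted_lp_pool(responses, uniform, 4)         # in between
max_pooled = weighted_lp_pool(responses, uniform, math.inf)  # max pooling

# Non-uniform weights make the pooled value depend on where a response
# occurs, which is how spatial information can survive pooling.
left_heavy = weighted_lp_pool(responses, [0.7, 0.1, 0.1, 0.1], 1)
right_heavy = weighted_lp_pool(responses, [0.1, 0.1, 0.1, 0.7], 1)
```

Under these assumptions, uniform weights with `p = 1` reduce to average pooling, and `p` tending to infinity reduces to max pooling, with intermediate `p` interpolating between the two. With non-uniform weights, the same set of responses pools to different values depending on their spatial layout.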