Towards compact topical descriptors

Authors:
Jie Chen
Affiliations:
National Engineering Lab for Video Technology, Peking University, Beijing, China
Venue:
CVPR '12 Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Year:
2012

Citing 0
Cited 3

Learning from mobile contexts to minimize the mobile location search latency
Image Communication

Weakly supervised codebook learning by iterative label propagation with graph quantization
Signal Processing

Sparse online topic models
Proceedings of the 22nd international conference on World Wide Web

Abstract

We introduce a Compact Topical Descriptor that learns a compact yet discriminative image signature from a reference image corpus. The descriptor is built on the widely used bag-of-words image histogram and has two merits over traditional topical features. First, we directly control the topical sparsity to achieve descriptor compactness. Second, we ensure descriptor discriminability by minimizing the bag-of-words reconstruction error during topical histogram encoding. To this end, we take a generative view of topical feature extraction, which is formulated as a sparse MAP estimation over the original bag-of-words. We learn this estimation by a bi-convex optimization that alternates between hierarchical sparse coding from words to topical histograms and dictionary learning of the corresponding word-to-topic transform. In addition, supervised labels such as image ranking lists can also be incorporated into our descriptor learning paradigm. We evaluate our performance on both ImageNet 10K and NUS-WIDE, with comparisons to bag-of-words, LDA, miniBoF, and Aggregated Local Descriptors. In practice, we also deploy our descriptor in a low bit rate mobile visual search application, i.e., sending compact descriptors instead of the image to reduce the query delivery latency. In quantitative evaluations over 10 million reference images, our descriptor significantly outperforms state-of-the-art compact descriptors.
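
The alternating scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps the hierarchical sparse coding and the MAP formulation for a plain L1-penalized reconstruction objective, solved by ISTA for the codes and by least squares for the word-to-topic dictionary. The names `fit_compact_topics`, `sparse_code`, `update_dictionary`, and the parameter `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(X, D, lam, n_iter=100):
    # ISTA for min_Z 0.5 * ||X - Z D||_F^2 + lam * ||Z||_1 with D fixed.
    # X: (n_images, n_words), D: (n_topics, n_words), Z: (n_images, n_topics).
    L = np.linalg.norm(D @ D.T, 2) + 1e-12  # Lipschitz constant of the gradient
    Z = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (Z @ D - X) @ D.T
        Z = soft_threshold(Z - grad / L, lam / L)
    return Z

def update_dictionary(X, Z, eps=1e-8):
    # Least-squares fit of D in Z D ~ X with Z fixed, then rescale each
    # topic (row) to at most unit norm to remove the scale ambiguity.
    D = np.linalg.lstsq(Z, X, rcond=None)[0]
    norms = np.maximum(np.linalg.norm(D, axis=1, keepdims=True), eps)
    return D / norms

def fit_compact_topics(X, n_topics, lam=0.1, n_outer=10, seed=0):
    # Alternate between the two convex subproblems (assumed simplification
    # of the bi-convex optimization mentioned in the abstract).
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((n_topics, X.shape[1]))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    for _ in range(n_outer):
        Z = sparse_code(X, D, lam)
        D = update_dictionary(X, Z)
    # Finish with a coding step so Z matches the final dictionary.
    Z = sparse_code(X, D, lam)
    return D, Z

if __name__ == "__main__":
    # Toy bag-of-words histograms: 40 images over a 30-word vocabulary.
    X = np.random.default_rng(1).random((40, 30))
    D, Z = fit_compact_topics(X, n_topics=8)
    print("nonzero topic weights per image:", (Z != 0).sum(axis=1).mean())
    print("relative reconstruction error:",
          np.linalg.norm(X - Z @ D) / np.linalg.norm(X))
```

In this sketch, a larger `lam` yields sparser topical histograms `Z` (a more compact signature) at the cost of a larger bag-of-words reconstruction error, which mirrors the compactness versus discriminability trade-off the abstract describes.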