Learning image representations from the pixel level via hierarchical sparse coding

Authors:
Kai Yu; Yuanqing Lin; J. Lafferty
Affiliations:
NEC Labs. America, Cupertino, CA, USA; NEC Labs. America, Cupertino, CA, USA; Carnegie Mellon Univ., Pittsburgh, PA, USA
Venue:
CVPR '11 Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition
Year:
2011

Cited by:

Multi-channel shape-flow kernel descriptors for robust video event detection and retrieval
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II

Sparse coding based visual tracking: Review and experimental comparison
Pattern Recognition

Recognizing architecture styles by hierarchical sparse coding of blocklets
Information Sciences: an International Journal

Learning group-based dictionaries for discriminative image representation
Pattern Recognition

Abstract

We present a method for learning image representations using a two-layer sparse coding scheme at the pixel level. The first layer encodes local patches of an image. After pooling within local regions, the first layer codes are then passed to the second layer, which jointly encodes signals from the region. Unlike traditional sparse coding methods that encode local patches independently, this approach accounts for high-order dependency among patterns in a local image neighborhood. We develop algorithms for data encoding and codebook learning, and show in experiments that the method leads to more invariant and discriminative image representations. The algorithm gives excellent results for hand-written digit recognition on MNIST and object recognition on the Caltech101 benchmark. This marks the first time that such accuracies have been achieved using automatically learned features from the pixel level, rather than using hand-designed descriptors.
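The pipeline described in the abstract (encode local patches, pool within local regions, encode the pooled signals in a second layer) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes random unit-norm dictionaries in place of learned codebooks, plain ISTA as the sparse encoder, max pooling of absolute codes, and a second layer that simply sparse-codes each pooled region vector. All function names, sizes, and parameters (`ista`, `two_layer_encode`, `patch`, `stride`, `pool`, `lam`) are hypothetical choices made for illustration.

```python
import numpy as np

def ista(X, D, lam=0.1, n_iter=50):
    """Sparse-code the columns of X over dictionary D with ISTA.

    Approximately minimizes 0.5 * ||X - D Z||_F^2 + lam * ||Z||_1.
    """
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth term's gradient
    Z = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        G = Z - (D.T @ (D @ Z - X)) / L                         # gradient step
        Z = np.sign(G) * np.maximum(np.abs(G) - lam / L, 0.0)   # soft threshold
    return Z

def extract_patches(img, size, stride):
    """Return flattened patches as columns, plus the (rows, cols) patch grid shape."""
    H, W = img.shape
    rows = range(0, H - size + 1, stride)
    cols = range(0, W - size + 1, stride)
    patches = [img[r:r + size, c:c + size].ravel() for r in rows for c in cols]
    return np.array(patches).T, (len(rows), len(cols))

def max_pool(Z, grid, pool):
    """Max-pool absolute codes over non-overlapping pool x pool cells of the patch grid."""
    K = Z.shape[0]
    gh, gw = grid
    A = np.abs(Z).reshape(K, gh, gw)
    ph, pw = gh // pool, gw // pool
    A = A[:, :ph * pool, :pw * pool].reshape(K, ph, pool, pw, pool)
    return A.max(axis=(2, 4)).reshape(K, ph * pw)

def random_dictionary(dim, n_atoms, rng):
    """Assumption: a random unit-norm dictionary stands in for a learned codebook."""
    D = rng.standard_normal((dim, n_atoms))
    return D / np.linalg.norm(D, axis=0, keepdims=True)

def two_layer_encode(img, D1, D2, patch=4, stride=2, pool=2, lam=0.1):
    X, grid = extract_patches(img, patch, stride)  # pixel-level patches
    Z1 = ista(X, D1, lam)                          # layer 1: encode each patch
    P = max_pool(Z1, grid, pool)                   # pool codes within local regions
    Z2 = ista(P, D2, lam)                          # layer 2: encode each pooled region
    return Z1, P, Z2

rng = np.random.default_rng(0)
img = rng.random((16, 16))           # stand-in for a grayscale image
D1 = random_dictionary(16, 32, rng)  # 4x4 patches -> 32 first-layer atoms
D2 = random_dictionary(32, 64, rng)  # pooled 32-dim codes -> 64 second-layer atoms
Z1, P, Z2 = two_layer_encode(img, D1, D2)
print(Z1.shape, P.shape, Z2.shape)
```

In this sketch each column of `Z2` depends on every first-layer code inside its pooling region, which is only a rough analogue of the joint second-layer encoding the abstract describes. The codebook learning step is omitted entirely.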