Multi-Level structured image coding on high-dimensional image representation

Authors:
Li-Jia Li;Jun Zhu;Hao Su;Eric P. Xing;Li Fei-Fei
Affiliations:
Computer Science Department, Stanford University and Yahoo! Research;Machine Learning Department, Carnegie Mellon University and Tsinghua University, China;Computer Science Department, Stanford University;Machine Learning Department, Carnegie Mellon University;Computer Science Department, Stanford University
Venue:
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part II
Year:
2012

References:
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
A fast learning algorithm for deep belief nets. Neural Computation
Semantic Modeling of Natural Scenes for Content-Based Image Retrieval. International Journal of Computer Vision
Self-taught learning: transfer learning from unlabeled data. Proceedings of the 24th international conference on Machine learning
An efficient projection for l1,∞ regularization. ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences
Online Learning for Matrix Factorization and Sparse Coding. The Journal of Machine Learning Research
Efficient object category recognition using classemes. ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
Objects as attributes for scene classification. ECCV'10 Proceedings of the 11th European conference on Trends and Topics in Computer Vision - Volume Part I

Cited by:
Sparse online topic models. Proceedings of the 22nd international conference on World Wide Web

Abstract

Robust image representations such as classemes [1], Object Bank (OB) [2], and the spatial pyramid representation (SPM) [3] have been proposed and show superior performance in various high-level visual recognition tasks. Our work is motivated by the need to explore the rich structural information encoded by these image representations. In this paper, we propose a novel Multi-Level Structured Image Coding approach that uncovers the structure embedded in such representations by learning a structured dictionary from them. Specifically, we choose Object Bank [2] to demonstrate our algorithm, since it encodes both semantics and spatial location as structural information. Using the structured dictionary learned from Object Bank, we can compute a lower-dimensional, more compact encoding of the image features while preserving and accentuating the rich semantic and spatial information of OB. Our framework is an unsupervised method based on minimizing the reconstruction error of the image and object codes, with an innovative multi-level structural regularization scheme. The object dictionary and the image code obtained by our model offer intriguing intuition about real-world image structures while preserving the informative structure of the original OB. We show that our more compact representation outperforms several state-of-the-art representations (including the original OB) on a wide range of high-level visual tasks such as scene classification, image retrieval, and annotation.
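
The abstract does not spell out the objective or the multi-level regularizer, so the following is only a minimal sketch of the general idea it describes: unsupervised dictionary learning that minimizes reconstruction error under a structured sparsity penalty. It swaps in a plain group-lasso penalty and alternating proximal/projected gradient steps as a generic stand-in, not the paper's algorithm. All names (`group_soft_threshold`, `learn_group_sparse_dictionary`, `groups`, `lam`) are hypothetical, and the grouping of code entries is an assumption made for illustration.

```python
import numpy as np


def group_soft_threshold(codes, groups, thresh):
    """Proximal operator of a group-lasso penalty (assumed stand-in regularizer).

    For each sample (row) and each group of code entries, the group's L2 norm
    is shrunk by `thresh`; groups whose norm is below `thresh` become zero.
    `groups` is assumed to partition the columns of `codes`.
    """
    out = np.zeros_like(codes)
    for idx in groups:
        block = codes[:, idx]
        norms = np.linalg.norm(block, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - thresh / np.maximum(norms, 1e-12))
        out[:, idx] = scale * block
    return out


def objective(X, S, D, groups, lam):
    """0.5 * ||X - S D||_F^2 + lam * sum over samples and groups of ||S_group||_2."""
    recon = 0.5 * np.sum((X - S @ D) ** 2)
    penalty = sum(np.linalg.norm(S[:, idx], axis=1).sum() for idx in groups)
    return recon + lam * penalty


def learn_group_sparse_dictionary(X, n_atoms, groups, lam=0.1, n_iters=100, seed=0):
    """Alternate one proximal gradient step on the codes S with one projected
    gradient step on the dictionary D (rows kept inside the unit ball)."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((n_atoms, X.shape[1]))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    S = np.zeros((X.shape[0], n_atoms))
    history = [objective(X, S, D, groups, lam)]
    for _ in range(n_iters):
        # Code update: step size 1/L, with L the Lipschitz constant of the gradient in S.
        step_s = 1.0 / (np.linalg.norm(D @ D.T, 2) + 1e-12)
        S = group_soft_threshold(S - step_s * (S @ D - X) @ D.T, groups, lam * step_s)
        # Dictionary update: gradient step in D, then project each atom onto the unit ball.
        step_d = 1.0 / (np.linalg.norm(S.T @ S, 2) + 1e-12)
        D = D - step_d * S.T @ (S @ D - X)
        D /= np.maximum(1.0, np.linalg.norm(D, axis=1, keepdims=True))
        history.append(objective(X, S, D, groups, lam))
    return S, D, history
```

In this sketch, `X` would hold one high-dimensional feature vector per row (for example, an Object Bank response vector), `S` the compact codes, and `D` the learned dictionary. Each group in `groups` is switched on or off as a unit for a given image, which is one simple way a structured penalty can keep related code entries together.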