Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
A fast learning algorithm for deep belief nets
Neural Computation
Semantic Modeling of Natural Scenes for Content-Based Image Retrieval
International Journal of Computer Vision
Self-taught learning: transfer learning from unlabeled data
Proceedings of the 24th international conference on Machine learning
An efficient projection for l1, ∞ regularization
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
SIAM Journal on Imaging Sciences
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
SIAM Journal on Imaging Sciences
Online Learning for Matrix Factorization and Sparse Coding
The Journal of Machine Learning Research
Efficient object category recognition using classemes
ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
Objects as attributes for scene classification
ECCV'10 Proceedings of the 11th European conference on Trends and Topics in Computer Vision - Volume Part I
Proceedings of the 22nd international conference on World Wide Web
Hi-index | 0.00 |
Robust image representations such as classemes [1], Object Bank (OB) [2], spatial pyramid representation(SPM) [3] have been proposed, showing superior performance in various high level visual recognition tasks. Our work is motivated by the need of exploring rich structural information encoded by these image representations. In this paper, we propose a novel Multi-Level Structured Image Coding approach to uncover the structure embedded in representations with rich regular structural information by learning a structured dictionary from it. Specifically, we choose Object Bank [2] to demonstrate our algorithm since it encodes both semantics and spatial location as structural information. By using the learned structured dictionary from Object Bank, we can compute a lower-dimensional and more compact encoding of the image features while preserving and accentuating the rich semantic and spatial information of OB. Our framework is an unsupervised method based on minimizing the reconstruction error of the image and object codes, with an innovative multi-level structural regularization scheme. The object dictionary and the image code obtained by our model offer intriguing intuition of real-world image structures while preserving informative structure of the original OB. We show that our more compact representation outperforms several state-of-the-art representations (including the original OB) on a wide range of high-level visual tasks such as scene classification, image retrieval and annotation.