Learning dictionary on manifolds for image classification

Authors:
Bao-Di Liu;Yu-Xiong Wang;Yu-Jin Zhang;Bin Shen
Affiliations:
Department of Electronic Engineering, Tsinghua University, Beijing 100084, China;Department of Electronic Engineering, Tsinghua University, Beijing 100084, China;Department of Electronic Engineering, Tsinghua University, Beijing 100084, China;Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
Venue:
Pattern Recognition
Year:
2013

Abstract

Dictionary-based models are now widely used in image classification. Image features are approximated as sparse linear combinations of bases selected from a dictionary, which yields compact representations. The features used for image classification usually reside on low-dimensional manifolds embedded in a high-dimensional ambient space; traditional sparse coding algorithms, however, do not take this topological structure into account. The structure can be characterized naturally by the linear coefficients that reconstruct each data point from its neighbors. The central issues are how to determine the neighbors and how to learn the coefficients. In this paper, the geometrical structure is encoded in two situations. In simple cases, where the data points are distributed on a single manifold, the structure is modeled explicitly by the locally linear embedding algorithm combined with k-nearest neighbors. In real-world scenarios, however, complex data points often lie on multiple manifolds. In that case, a sparse representation algorithm combined with k-nearest neighbors is used instead to construct the topological structure, because it approximates each data point by adaptively selecting its homogeneous neighbors in order to guarantee the smoothness of each manifold. Once the local fitting relationships are obtained, the two topological structures are embedded into the sparse coding algorithm as regularization terms, giving the objective functions of dictionary learning on a single manifold (DLSM) and dictionary learning on multiple manifolds (DLMM), respectively. A coordinate descent scheme is then proposed to solve the unified optimization problems. Experimental results on several benchmark data sets, including Caltech-256, Caltech-101, Scene 15, and UIUC-Sports, show that the proposed algorithms equal or outperform other state-of-the-art image classification algorithms.
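
To make the single-manifold case concrete, the following is a minimal sketch of how reconstruction coefficients in the style of locally linear embedding could be computed from k-nearest neighbors, and how they might enter a sparse coding objective as a regularization term. It is an illustration under stated assumptions, not the paper's implementation: the function names `lle_weights` and `manifold_penalty`, the ridge parameter `reg`, and the penalty form ||S - S W^T||_F^2 are all hypothetical choices, since the abstract does not give the exact formulation.

```python
import numpy as np


def lle_weights(X, k=5, reg=1e-3):
    """Reconstruct each row of X from its k nearest neighbors.

    X is (n_samples, n_features). Returns W of shape (n, n) where row i
    holds the coefficients that rebuild x_i from its neighbors; each row
    sums to 1 and the diagonal is 0. `reg` is an assumed ridge term that
    keeps the local Gram matrix invertible.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbor
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]                  # k-nearest neighbors of x_i
        Z = X[nbrs] - X[i]                            # neighbors centered on x_i
        G = Z @ Z.T                                   # local Gram matrix, (k, k)
        G = G + reg * (np.trace(G) + 1e-12) * np.eye(k)
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()                      # enforce sum-to-one constraint
    return W


def manifold_penalty(S, W):
    """Assumed regularizer: how badly the codes violate the local fit.

    S is (n_atoms, n_samples) with one sparse code per column. The value
    is small when each code is well reconstructed from its neighbors'
    codes using the same coefficients W learned on the features.
    """
    return float(np.linalg.norm(S - S @ W.T, "fro") ** 2)
```

For the multiple-manifold case described in the abstract, the least-squares solve inside the loop would presumably be replaced by a sparse solver over the k-nearest neighbors, so that only a subset of the neighbors receives nonzero coefficients; that variant is not shown here.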