Bilinear deep learning for image classification

Authors:
Sheng-hua Zhong;Yan Liu;Yang Liu
Affiliations:
The Hong Kong Polytechnic University, Hong Kong, Hong Kong;The Hong Kong Polytechnic University, Hong Kong, Hong Kong;The Hong Kong Polytechnic University, Hong Kong, Hong Kong
Venue:
MM '11 Proceedings of the 19th ACM international conference on Multimedia
Year:
2011



Abstract

Image classification is a classical problem in multimedia content analysis. This paper proposes a novel deep learning model, the bilinear deep belief network (BDBN), for image classification. Unlike previous image classification models, BDBN aims to provide human-like judgment by following the architecture of the human visual system and the procedure of intelligent perception. To this end, the model mirrors the multi-layer structure of the cortex and the propagation of information through the visual areas of the brain. Unlike most existing deep models, BDBN uses a bilinear discriminant strategy to simulate the "initial guess" in human object recognition while also avoiding bad local optima. To preserve the natural tensor structure of image data, a novel deep architecture with greedy layer-wise reconstruction and global fine-tuning is proposed. To adapt to real-world image classification tasks, we develop BDBN under a semi-supervised learning framework, which lets the deep model work well when labeled images are scarce. Comparative experiments on three standard datasets show that the proposed algorithm outperforms both representative classification models and existing deep learning techniques. More interestingly, our demonstrations show that the behavior of the proposed BDBN is consistent with human visual perception.
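
The greedy layer-wise idea mentioned in the abstract can be illustrated with a generic stack of restricted Boltzmann machines, each trained with one-step contrastive divergence on the output of the layer below. This is a minimal sketch of that standard technique, not the BDBN itself: it flattens each image into a vector (so it does not preserve the tensor structure the paper emphasizes), and it omits the bilinear discriminant initialization, the global fine-tuning stage, and the semi-supervised objective. The class `RBM`, the function `pretrain_stack`, the layer sizes, the learning rate, and the random toy data are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Binary restricted Boltzmann machine trained with CD-1 (illustrative)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        """One full-batch contrastive divergence update; returns reconstruction error."""
        h0 = self.hidden_probs(v0)
        # Sample binary hidden states, then reconstruct the visible layer.
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        # Positive phase minus negative phase, averaged over the batch.
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))


def pretrain_stack(data, layer_sizes, epochs=5, seed=0):
    """Train one RBM per layer, feeding each layer's hidden activations upward."""
    rng = np.random.default_rng(seed)
    layers, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        layers.append(rbm)
        x = rbm.hidden_probs(x)  # input to the next layer
    return layers, x


# Toy data standing in for 64 binarized 16x16 images, flattened to vectors.
toy_rng = np.random.default_rng(1)
toy_images = (toy_rng.random((64, 16 * 16)) > 0.5).astype(float)
layers, features = pretrain_stack(toy_images, [128, 32])
print(features.shape)  # (64, 32)
```

In a complete deep belief network pipeline, the pretrained weights would then typically initialize a network that is fine-tuned end to end with labels; that stage is left out here.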