Efficient learning of sparse, distributed, convolutional feature representations for object recognition

Authors:
Kihyuk Sohn; Dae Yon Jung; Honglak Lee; Alfred O. Hero
Affiliations:
Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, 48109, USA; Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, 48109, USA; Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, 48109, USA; Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, 48109, USA
Venue:
ICCV '11: Proceedings of the 2011 International Conference on Computer Vision
Year:
2011

Abstract

Informative image representations are important in achieving state-of-the-art performance in object recognition tasks. Among feature learning algorithms that are used to develop image representations, restricted Boltzmann machines (RBMs) have good expressive power and build effective representations. However, the difficulty of training RBMs has been a barrier to their wide use. To address this difficulty, we show the connections between mixture models and RBMs and present an efficient training method for RBMs that utilizes these connections. To the best of our knowledge, this is the first work showing that RBMs can be trained with almost no hyperparameter tuning to provide classification performance similar to or significantly better than mixture models (e.g., Gaussian mixture models). Along with this efficient training, we evaluate the importance of convolutional training, which can capture a larger spatial context with less redundancy than non-convolutional training. Overall, our method achieves state-of-the-art performance on both the Caltech 101 and Caltech 256 datasets using a single type of feature.