Learning to represent spatial transformations with factored higher-order Boltzmann machines

Authors:
Roland Memisevic;Geoffrey E. Hinton
Venue:
Neural Computation
Year:
2010

Abstract

To allow the hidden units of a restricted Boltzmann machine to model the transformation between two successive images, Memisevic and Hinton (2007) introduced three-way multiplicative interactions that use the intensity of a pixel in the first image as a multiplicative gain on a learned, symmetric weight between a pixel in the second image and a hidden unit. This creates cubically many parameters, which form a three-dimensional interaction tensor. We describe a low-rank approximation to this interaction tensor that uses a sum of factors, each of which is a three-way outer product. This approximation allows efficient learning of transformations between larger image patches. Since each factor can be viewed as an image filter, the model as a whole learns optimal filter pairs for efficiently representing transformations. We demonstrate the learning of optimal filter pairs from various synthetic and real image sequences. We also show how learning about image transformations allows the model to perform a simple visual analogy task, and we show how a completely unsupervised network trained on transformations perceives multiple motions of transparent dot patterns in the same way as humans.
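The low-rank idea in the abstract can be illustrated with a small numerical sketch. It is not the authors' code: the names `wx`, `wy`, `wh`, the toy sizes, and the random values are assumptions chosen for illustration, and biases, hidden-unit nonlinearities, and learning are all omitted. The sketch only checks that, when the interaction tensor is a sum of three-way outer products, the input to each hidden unit can be computed from per-factor filter responses without ever building the full tensor.

```python
import numpy as np

def full_tensor(wx, wy, wh):
    """Build W[i, j, k] = sum_f wx[i, f] * wy[j, f] * wh[k, f].

    Each factor f contributes one three-way outer product.
    """
    return np.einsum("if,jf,kf->ijk", wx, wy, wh)

def factored_hidden_input(x, y, wx, wy, wh):
    """Hidden-unit inputs computed from filter responses only."""
    fx = x @ wx                 # responses of the first-image filters, shape (F,)
    fy = y @ wy                 # responses of the second-image filters, shape (F,)
    return (fx * fy) @ wh.T     # pool the per-factor products, shape (K,)

# Toy sizes (assumed): I first-image pixels, J second-image pixels,
# K hidden units, F factors.
I, J, K, F = 6, 5, 4, 3
rng = np.random.default_rng(0)
wx = rng.normal(size=(I, F))
wy = rng.normal(size=(J, F))
wh = rng.normal(size=(K, F))
x = rng.normal(size=I)
y = rng.normal(size=J)

# Dense route: contract the full three-way tensor with both images.
W = full_tensor(wx, wy, wh)
dense = np.einsum("ijk,i,j->k", W, x, y)

# Factored route: never materializes W.
fact = factored_hidden_input(x, y, wx, wy, wh)

full_params = I * J * K            # cubic growth in the unfactored model
factored_params = F * (I + J + K)  # linear growth in each dimension
```

In this toy setting the two routes agree up to floating-point error, while the parameter count drops from `I * J * K` to `F * (I + J + K)`. The columns of `wx` and `wy` play the role of the filter pairs mentioned in the abstract.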