In this paper, we consider the problem of unsupervised feature learning for spatio-temporal data streams, specifically video data. We focus on learning features that are invariant to image transformations, and we regard a video stream as a sequence of pairwise similar images. Many existing methods for invariant feature extraction either build an explicit model of the transformations present in the data or achieve invariance by adding a penalty term to a reconstruction loss. In contrast, we propose to learn invariant features by directly optimizing the temporal coherence of a hidden, and possibly deep, representation. We find that our approach is both fast and capable of learning deep feature representations invariant to complex image transformations. We furthermore show that features learned with our approach improve object recognition performance on still images (Caltech-101, STL-10).
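The temporal-coherence idea described above can be sketched as a toy objective: temporally adjacent frames should be mapped to nearby hidden codes, while some contrastive pressure on non-adjacent frames rules out the trivial constant representation. The sketch below is illustrative only and not the paper's exact objective; the margin-based "push" term is an assumption in the spirit of DrLIM-style contrastive losses, and the function name `temporal_coherence_loss` and parameter `margin` are hypothetical.

```python
import numpy as np

def temporal_coherence_loss(h, margin=1.0):
    """Toy temporal-coherence objective on a sequence of hidden codes.

    h      : array of shape (T, D), one D-dimensional code per frame.
    margin : hinge margin for the contrastive (push) term.

    Pull term: mean squared distance between temporally adjacent codes,
    encouraging a representation that varies slowly over time.
    Push term (an illustrative assumption, not necessarily the paper's
    objective): a hinge on distances between non-adjacent frame pairs,
    which prevents the degenerate solution of a constant code.
    """
    T = len(h)
    # Pull: adjacent frames should have similar codes.
    pull = np.sum((h[1:] - h[:-1]) ** 2) / (T - 1)

    # Push: non-adjacent frames should be at least `margin` apart.
    push, n_pairs = 0.0, 0
    for i in range(T):
        for j in range(i + 2, T):  # skip temporally adjacent pairs
            d = np.linalg.norm(h[i] - h[j])
            push += max(0.0, margin - d) ** 2
            n_pairs += 1
    if n_pairs:
        push /= n_pairs

    return pull + push
```

In practice one would minimize such a loss over the parameters of an encoder (e.g. by gradient descent on mini-batches of frame pairs); the NumPy version here only evaluates the objective for a fixed set of codes.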