In this paper, we consider the problem of unsupervised feature learning for spatio-temporal data streams, specifically video data. We focus on learning features that are invariant to image transformations, and we regard a video stream as a sequence of pairwise similar images. Many existing methods for invariant feature extraction either build an explicit model of the transformations present in the data or achieve invariance by adding a penalty term to a reconstruction loss. In contrast, we propose to learn invariant features by directly optimizing the temporal coherence of a hidden, and possibly deep, representation. We find that our approach is both fast and capable of learning deep feature representations invariant to complex image transformations. We furthermore show that features learned with our approach improve object recognition performance on still images (Caltech-101, STL-10).
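The temporal-coherence idea described above can be sketched as a toy objective: temporally adjacent frames should be mapped to nearby hidden codes, while some contrastive pressure on non-adjacent frames rules out the trivial constant representation. The sketch below is illustrative only and not the paper's exact objective; the margin-based "push" term is an assumption in the spirit of DrLIM-style contrastive losses, and the function name `temporal_coherence_loss` and parameter `margin` are hypothetical.

```python
import numpy as np

def temporal_coherence_loss(h, margin=1.0):
    """Toy temporal-coherence objective on a sequence of hidden codes.

    h      : array of shape (T, D), one D-dimensional code per frame.
    margin : hinge margin for the contrastive (push) term.

    Pull term: mean squared distance between temporally adjacent codes,
    encouraging a representation that varies slowly over time.
    Push term (an illustrative assumption, not necessarily the paper's
    objective): a hinge on distances between non-adjacent frame pairs,
    which prevents the degenerate solution of a constant code.
    """
    T = len(h)
    # Pull: adjacent frames should have similar codes.
    pull = np.sum((h[1:] - h[:-1]) ** 2) / (T - 1)

    # Push: non-adjacent frames should be at least `margin` apart.
    push, n_pairs = 0.0, 0
    for i in range(T):
        for j in range(i + 2, T):  # skip temporally adjacent pairs
            d = np.linalg.norm(h[i] - h[j])
            push += max(0.0, margin - d) ** 2
            n_pairs += 1
    if n_pairs:
        push /= n_pairs

    return pull + push
```

In practice one would minimize such a loss over the parameters of an encoder (e.g. by gradient descent on mini-batches of frame pairs); the NumPy version here only evaluates the objective for a fixed set of codes.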