We propose a novel regularizer for training an auto-encoder for unsupervised feature extraction. We explicitly encourage the latent representation to contract the input space by regularizing the norm of the Jacobian (analytically) and the Hessian (stochastically) of the encoder's output with respect to its input, at the training points. While the penalty on the Jacobian's norm ensures robustness to tiny corruptions of samples in the input space, constraining the norm of the Hessian extends this robustness further away from the samples. From a manifold learning perspective, balancing this regularization against the auto-encoder's reconstruction objective yields a representation that varies most when moving along the data manifold in input space, and is most insensitive in directions orthogonal to the manifold. The second-order regularization, via the Hessian, penalizes curvature and thus favors smooth manifolds. We show that our proposed technique, while remaining computationally efficient, yields representations that are significantly better suited for initializing deep architectures than previously proposed approaches, beating state-of-the-art performance on a number of datasets.
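The two penalties described above can be sketched for a simple sigmoid encoder. This is a minimal illustration, not the paper's implementation: for h = sigmoid(Wx + b), the Jacobian dh/dx is available in closed form, giving the analytic first-order term, while the Hessian-norm term is approximated stochastically by the expected squared Frobenius distance between Jacobians at the sample and at nearby corrupted points, scaled by the corruption variance. All function names and hyperparameter values here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encoder_jacobian(W, b, x):
    # For h = sigmoid(W x + b), the Jacobian dh/dx is diag(h * (1 - h)) @ W.
    h = sigmoid(W @ x + b)
    return (h * (1.0 - h))[:, None] * W

def higher_order_penalty(W, b, x, sigma=0.1, n_corrupt=4, rng=None):
    # Illustrative sketch of the two regularization terms from the abstract.
    if rng is None:
        rng = np.random.default_rng(0)
    J = encoder_jacobian(W, b, x)
    # First-order term: analytic squared Frobenius norm of the Jacobian.
    jacobian_pen = np.sum(J ** 2)
    # Second-order term: stochastic estimate of the Hessian norm via
    # E[ ||J(x + eps) - J(x)||_F^2 ] / sigma^2 over small Gaussian corruptions.
    diffs = []
    for _ in range(n_corrupt):
        eps = sigma * rng.standard_normal(x.shape)
        diffs.append(np.sum((encoder_jacobian(W, b, x + eps) - J) ** 2))
    hessian_pen = np.mean(diffs) / sigma ** 2
    return jacobian_pen, hessian_pen

# Toy usage on random parameters and a single training point.
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((3, 5))
b = np.zeros(3)
x = np.ones(5)
jac_pen, hess_pen = higher_order_penalty(W, b, x, rng=rng)
```

In training, both terms would be added (with tunable weights) to the auto-encoder's reconstruction loss; here they are simply computed at one point to show the shape of the computation.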