Deep Neural Networks (DNNs) offer a new and efficient machine learning architecture based on the layer-wise construction of several representation layers. A critical issue for DNNs remains model selection, e.g., choosing the number of neurons in each layer. Because the hyper-parameter search space grows exponentially with the number of layers, the popular grid-search approach for finding good hyper-parameter values becomes intractable. The question investigated in this paper is whether the unsupervised, layer-wise methodology used to train a DNN can be extended to model selection as well. The proposed approach, based on an unsupervised criterion, empirically examines whether model selection is a modular optimization problem that can be tackled in a layer-wise manner. Preliminary results on the MNIST data set suggest the answer is positive. Furthermore, some unexpected results regarding the optimal layer size as a function of the training process are reported and discussed.
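To make the layer-wise model-selection idea concrete, the following is a minimal sketch, not the paper's actual procedure: it greedily selects each layer's size by fitting candidate RBMs and scoring them with an unsupervised criterion (here, mean pseudo-likelihood from scikit-learn's BernoulliRBM), then propagates the chosen representation upward. The candidate sizes, the RBM layers, and the specific criterion are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM


def select_layer_sizes(X, candidate_sizes, n_layers=3, seed=0):
    """Greedy, layer-wise model selection: for each layer, pick the hidden
    size that maximizes an unsupervised criterion (mean pseudo-likelihood
    of a BernoulliRBM), then propagate the data to the next layer."""
    layers, current = [], X
    for _ in range(n_layers):
        best_size, best_score, best_rbm = None, -np.inf, None
        for size in candidate_sizes:
            rbm = BernoulliRBM(n_components=size, learning_rate=0.05,
                               n_iter=10, random_state=seed)
            rbm.fit(current)
            # Unsupervised selection criterion: mean log pseudo-likelihood.
            score = rbm.score_samples(current).mean()
            if score > best_score:
                best_size, best_score, best_rbm = size, score, rbm
        layers.append((best_size, best_rbm))
        # Fix this layer and use its representation as input to the next one.
        current = best_rbm.transform(current)
    return layers


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random binary data as a stand-in for binarized MNIST pixels.
    X = (rng.random((200, 64)) > 0.5).astype(float)
    for depth, (size, _) in enumerate(select_layer_sizes(X, [16, 32, 64]), 1):
        print(f"layer {depth}: selected {size} hidden units")
```

The point of the sketch is the modularity: each layer's hyper-parameter is chosen independently once the layers below it are fixed, turning an exponential grid search over all layers into a sequence of small per-layer searches.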