Deep Neural Networks (DNNs) offer a new and efficient machine learning architecture based on the layer-wise construction of several representation layers. A critical issue for DNNs remains model selection, e.g., choosing the number of neurons in each layer. Because the hyper-parameter search space grows exponentially with the number of layers, the popular grid-search approach for finding good hyper-parameter values becomes intractable. The question investigated in this paper is whether the unsupervised, layer-wise methodology used to train a DNN can be extended to model selection as well. The proposed approach, based on an unsupervised criterion, empirically examines whether model selection is a modular optimization problem that can be tackled in a layer-wise manner. Preliminary results on the MNIST data set suggest the answer is positive. Furthermore, some unexpected results regarding the optimal layer size as a function of the training process are reported and discussed.
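To make the layer-wise model-selection idea concrete, the following is a minimal sketch, not the paper's actual procedure: it greedily selects each layer's size by fitting candidate RBMs and scoring them with an unsupervised criterion (here, mean pseudo-likelihood from scikit-learn's BernoulliRBM), then propagates the chosen representation upward. The candidate sizes, the RBM layers, and the specific criterion are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM


def select_layer_sizes(X, candidate_sizes, n_layers=3, seed=0):
    """Greedy, layer-wise model selection: for each layer, pick the hidden
    size that maximizes an unsupervised criterion (mean pseudo-likelihood
    of a BernoulliRBM), then propagate the data to the next layer."""
    layers, current = [], X
    for _ in range(n_layers):
        best_size, best_score, best_rbm = None, -np.inf, None
        for size in candidate_sizes:
            rbm = BernoulliRBM(n_components=size, learning_rate=0.05,
                               n_iter=10, random_state=seed)
            rbm.fit(current)
            # Unsupervised selection criterion: mean log pseudo-likelihood.
            score = rbm.score_samples(current).mean()
            if score > best_score:
                best_size, best_score, best_rbm = size, score, rbm
        layers.append((best_size, best_rbm))
        # Fix this layer and use its representation as input to the next one.
        current = best_rbm.transform(current)
    return layers


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random binary data as a stand-in for binarized MNIST pixels.
    X = (rng.random((200, 64)) > 0.5).astype(float)
    for depth, (size, _) in enumerate(select_layer_sizes(X, [16, 32, 64]), 1):
        print(f"layer {depth}: selected {size} hidden units")
```

The point of the sketch is the modularity: each layer's hyper-parameter is chosen independently once the layers below it are fixed, turning an exponential grid search over all layers into a sequence of small per-layer searches.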