Supervised and unsupervised co-training of adaptive activation functions in neural nets
PSL'11: Proceedings of the First IAPR TC3 Conference on Partially Supervised Learning
Standard feedforward neural networks benefit from the nice theoretical properties of mixtures of sigmoid activation functions, but they may fail in several practical learning tasks that would be better tackled with a more appropriate, problem-specific basis of activation functions. This paper presents a connectionist model that exploits adaptive activation functions. Each hidden unit in the network is associated with a specific pair (f(.), p(.)), where f(.) is the activation function and p(.) is the likelihood that the unit is relevant to the computation of the network output on the current input. The function f(.) is optimized in a supervised manner, while p(.) is realized via a statistical parametric model learned through unsupervised (or partially supervised) estimation. Since f(.) and p(.) influence each other's learning processes, the overall machine is implicitly a co-trained, coupled model and, in turn, a flexible, non-standard neural architecture. The feasibility of the approach is corroborated by empirical evidence from computer simulations of regression and classification tasks.
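The abstract does not fix the parametric forms of f(.) and p(.). The following Python/PyTorch sketch is one hypothetical instantiation, not the authors' implementation: it assumes f(.) is a sigmoid with a trainable amplitude, p(.) is a univariate Gaussian over the unit's net input rescaled to (0, 1], and the two are co-trained by summing a supervised loss with an unsupervised likelihood term. All class and parameter names below are illustrative assumptions.

    # Minimal sketch (assumptions, not the paper's method): each hidden unit
    # couples an adaptive activation f(.) with a relevance model p(.).
    import torch
    import torch.nn as nn

    class CoTrainedAdaptiveLayer(nn.Module):
        def __init__(self, in_features, hidden):
            super().__init__()
            self.lin = nn.Linear(in_features, hidden)
            self.amplitude = nn.Parameter(torch.ones(hidden))   # trainable amplitude of f(.)
            self.mu = nn.Parameter(torch.zeros(hidden))         # mean of the Gaussian p(.)
            self.log_sigma = nn.Parameter(torch.zeros(hidden))  # log std of the Gaussian p(.)

        def forward(self, x):
            net = self.lin(x)                          # per-unit net input
            f = self.amplitude * torch.sigmoid(net)    # adaptive activation f(.)
            z = (net - self.mu) / self.log_sigma.exp()
            p = torch.exp(-0.5 * z ** 2)               # relevance p(.) in (0, 1]
            return p * f                               # relevance-gated unit output

        def nll(self, x):
            # Unsupervised term: negative log-likelihood of the net inputs
            # under each unit's Gaussian, used to estimate (mu, sigma).
            net = self.lin(x).detach()
            z = (net - self.mu) / self.log_sigma.exp()
            return (0.5 * z ** 2 + self.log_sigma).mean()

    # One co-training step: the supervised loss shapes f(.), the likelihood
    # term fits p(.), and the two interact through the gated forward pass.
    layer, head = CoTrainedAdaptiveLayer(4, 8), nn.Linear(8, 1)
    opt = torch.optim.Adam([*layer.parameters(), *head.parameters()], lr=1e-2)
    x, y = torch.randn(32, 4), torch.randn(32, 1)
    loss = nn.functional.mse_loss(head(layer(x)), y) + 0.1 * layer.nll(x)
    opt.zero_grad(); loss.backward(); opt.step()

Because the relevance gate p(.) multiplies f(.) in the forward pass, gradients of the supervised loss also reach the Gaussian parameters, which is one plausible way to realize the mutual influence between f(.) and p(.) described in the abstract.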