Bayesian regularization and pruning using a Laplace prior

  • Authors:
  • Peter M. Williams

  • Venue:
  • Neural Computation
  • Year:
  • 1995

Abstract

Standard techniques for improved generalization from neural networks include weight decay and pruning. Weight decay has a Bayesian interpretation, with the decay function corresponding to a prior over weights. The method of transformation groups and maximum entropy suggests a Laplace rather than a Gaussian prior. After training, the weights then arrange themselves into two classes: (1) those with a common sensitivity to the data error, and (2) those failing to achieve this sensitivity, which therefore vanish. Since the critical value is determined adaptively during training, pruning, in the sense of setting weights to exact zeros, becomes an automatic consequence of regularization alone. The count of free parameters is also reduced automatically as weights are pruned. A comparison is made with results of MacKay using the evidence framework and a Gaussian regularizer.
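
The mechanism behind the two weight classes follows from the non-differentiability of the Laplace penalty at zero: at a minimum of the regularized error E_D + alpha * sum_i |w_i|, every nonzero weight must satisfy |dE_D/dw_i| = alpha (the common sensitivity), while any weight whose data-error sensitivity cannot reach alpha is held at exactly zero. The sketch below illustrates this on a linear model trained by proximal gradient descent with soft-thresholding; the fixed alpha, the synthetic data, and the optimizer are illustrative assumptions, not Williams's procedure, in which the critical value is adapted during training.

    import numpy as np

    # Minimal sketch (not Williams's exact algorithm): L1-penalized
    # regression, the Laplace-prior analogue of weight decay, solved by
    # proximal gradient descent.  The soft-threshold step sets weights
    # to exact zeros, so pruning falls out of the regularizer itself.

    rng = np.random.default_rng(0)

    # Synthetic data whose true weight vector is sparse.
    n, d = 200, 20
    w_true = np.zeros(d)
    w_true[:3] = [2.0, -1.5, 0.5]
    X = rng.normal(size=(n, d))
    y = X @ w_true + 0.1 * rng.normal(size=n)

    alpha = 5.0                            # regularization strength (held
                                           # fixed here; Williams adapts it)
    lr = 1.0 / np.linalg.norm(X, 2) ** 2   # step size 1/L for convergence

    w = np.zeros(d)
    for _ in range(2000):
        grad = X.T @ (X @ w - y)           # gradient of the data error E_D
        w -= lr * grad                     # gradient step on E_D
        w = np.sign(w) * np.maximum(np.abs(w) - lr * alpha, 0.0)  # soft threshold

    surviving = np.flatnonzero(w)
    print("surviving weights:", surviving)  # typically the three true indices
    print("their sensitivities |dE_D/dw|:",
          np.abs(X.T @ (X @ w - y))[surviving])  # each close to alpha

At the fixed point the surviving weights report data-error sensitivities clustered at alpha, while the remainder are exact zeros, so the count of free parameters shrinks without a separate pruning step.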