Backpropagation, like most learning algorithms capable of forming complex decision surfaces, is prone to overfitting. This work presents classification-based objective functions, an approach to training artificial neural networks on classification problems. Classification-based learning attempts to guide the network directly toward correct pattern classification rather than relying on common error minimization heuristics, such as sum-squared error (SSE) and cross-entropy (CE), which do not explicitly minimize classification error. CB1 is presented here as a novel objective function for learning classification problems. It seeks to minimize classification error directly by backpropagating error only on misclassified patterns, and only from culprit output nodes. CB1 discourages weight saturation and overfitting and achieves higher accuracy on classification problems than optimizing SSE or CE. Experiments on a large OCR data set show that CB1 significantly increases generalization accuracy over SSE and CE optimization, from 97.86% and 98.10%, respectively, to 99.11%. Comparable results are achieved over several data sets from the UC Irvine Machine Learning Repository, with an average increase in accuracy from 90.7% and 91.3% for optimized SSE and CE networks, respectively, to 92.1% for CB1. Analysis indicates that CB1 performs a fundamentally different search of the feature space than optimizing SSE or CE and produces significantly different solutions.
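The core idea, error generated only on misclassified patterns and only at culprit output nodes, can be sketched as follows. This is a simplified illustration under assumptions, not the published CB1 formulation (which the abstract does not spell out): here a "culprit" is any non-target output node whose activation meets or exceeds the target node's, and the error signal pushes the target node up toward the top competitor and culprit nodes down toward the target.

```python
import numpy as np

def cb1_error_signals(outputs, target):
    """Sketch of a CB1-style error signal (assumed, simplified form).

    outputs : 1-D array of output-node activations for one pattern
    target  : index of the correct class

    Returns per-node error signals: all zeros when the pattern is
    already classified correctly; otherwise nonzero only at the
    target node and at culprit nodes.
    """
    err = np.zeros_like(outputs, dtype=float)
    if np.argmax(outputs) == target:
        return err  # correctly classified: no error is backpropagated

    # Culprits: non-target nodes whose output >= the target node's output.
    culprits = outputs >= outputs[target]
    culprits[target] = False

    # Push culprit outputs down toward the target's output...
    err[culprits] = -(outputs[culprits] - outputs[target])
    # ...and push the target's output up toward the highest culprit.
    err[target] = outputs[culprits].max() - outputs[target]
    return err
```

For example, with outputs `[0.2, 0.9, 0.4]` and target class 2, the pattern is misclassified (node 1 wins), so node 1 receives a negative signal and node 2 a positive one, while node 0 (below the target) is left alone; with target class 1 the pattern is already correct and no error is produced. This contrasts with SSE or CE, which generate error at every output node on every pattern.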