On the Probabilistic Interpretation of Neural Network Classifiers and Discriminative Training Criteria

  • Authors:
  • Hermann Ney

  • Affiliations:
  • -

  • Venue:
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year:
  • 1995

Abstract

A probabilistic interpretation is presented for two important issues in neural-network-based classification: the interpretation of discriminative training criteria and of the neural network outputs, and the interpretation of the structure of the neural network. The problem of finding a suitable structure for the neural network can be linked to a number of well-established techniques in statistical pattern recognition, such as the method of potential functions, kernel densities, and continuous mixture densities. Discriminative training of the neural network outputs amounts to approximating the class posterior probabilities of the classical statistical approach. This paper extends these links by introducing and analyzing novel criteria such as maximizing the class probability and minimizing the smoothed error rate. These criteria are defined in the framework of class-conditional probability density functions. We show that these criteria can be interpreted in terms of weighted maximum likelihood estimation, where the weights depend in a complicated nonlinear fashion on the model parameters to be trained. In particular, this approach covers widely used techniques such as corrective training, learning vector quantization, and linear discriminant analysis.
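
As a rough mathematical sketch of the framework summarized in the abstract (the notation and the particular form chosen for the smoothed error rate below are illustrative assumptions, not quoted from the paper): with class-conditional densities $p_\theta(x \mid k)$ and class priors $p(k)$, Bayes' rule gives the class posterior probabilities

\[ p_\theta(k \mid x) = \frac{p(k)\, p_\theta(x \mid k)}{\sum_{k'} p(k')\, p_\theta(x \mid k')} . \]

Maximizing the class probability of the correct labels $k_n$ over training pairs $(x_n, k_n)$ corresponds to the criterion

\[ F_{\mathrm{MMI}}(\theta) = \sum_{n=1}^{N} \log p_\theta(k_n \mid x_n), \]

while a smoothed error rate replaces the non-differentiable 0/1 loss by a differentiable surrogate, for example

\[ F_{\mathrm{SER}}(\theta) = \sum_{n=1}^{N} \bigl[\, 1 - p_\theta(k_n \mid x_n) \,\bigr]. \]

Differentiating either criterion yields gradients of the class-conditional log-likelihoods $\log p_\theta(x_n \mid k)$ multiplied by factors that themselves depend on the current posteriors, which is the weighted maximum likelihood reading mentioned above:

\[ \frac{\partial F}{\partial \theta} = \sum_{n} \sum_{k} w_{n,k}(\theta)\, \frac{\partial}{\partial \theta} \log p_\theta(x_n \mid k), \]

where the weights $w_{n,k}(\theta)$ are nonlinear functions of the model parameters through the posteriors.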