Confidence intervals for probabilistic network classifiers

Authors:
M. Egmont-Petersen;A. Feelders;B. Baesens
Affiliations:
Utrecht University, Institute of Information and Computing Sciences, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands;Utrecht University, Institute of Information and Computing Sciences, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands;University of Southampton, School of Management, UK
Venue:
Computational Statistics & Data Analysis
Year:
2005

Abstract

Probabilistic networks (Bayesian networks) are well suited as statistical pattern classifiers when the feature variables are discrete. We argue that their white-box character makes them transparent, a requirement in applications such as credit scoring. In addition, the exact error rate of a probabilistic network classifier can be computed without a dataset. First, we specify the exact error rate for probabilistic network classifiers. Second, we derive the exact sampling distribution of the conditional probability estimates in a probabilistic network classifier; each conditional probability is distributed according to the bivariate binomial distribution. We then derive an approach for computing the sampling distribution, and hence confidence intervals, for the posterior probability in a probabilistic network classifier. This approach yields parametric bootstrap confidence intervals. Experiments with general probabilistic network classifiers, the Naive Bayes classifier, and tree-augmented Naive Bayes classifiers (TANs) show that our approximation performs well. Simulations with the Alarm network also show good results for large training sets. The amount of computation required is exponential in the number of feature variables. For medium and large-scale classification problems, our approach is well suited for quick simulations. A running example from the domain of credit scoring illustrates how to compute the sampling distribution of the posterior probability.
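The two computations named in the abstract can be illustrated on a toy model. The sketch below is not the authors' derivation: it assumes a Naive Bayes classifier with one binary class and two binary features, uses made-up parameter values, and approximates the sampling distribution of the posterior by Monte Carlo resampling (a percentile parametric bootstrap) rather than by the exact distribution the paper derives. The function names, the training-set size `n`, and the add-one smoothing of re-estimated parameters are all assumptions introduced for illustration.

```python
import itertools
import numpy as np

# Hypothetical Naive Bayes parameters (illustrative values, not from the paper).
prior = np.array([0.7, 0.3])          # P(C=0), P(C=1)
# cond[j, c] = P(X_j = 1 | C = c) for two binary features
cond = np.array([[0.2, 0.6],
                 [0.1, 0.5]])


def joint(x, prior, cond):
    """Return the vector P(C=c, X=x) over the classes c."""
    x = np.asarray(x)[:, None]                 # shape (features, 1)
    lik = np.where(x == 1, cond, 1.0 - cond)   # P(X_j = x_j | C = c)
    return prior * lik.prod(axis=0)


def posterior(x, prior, cond):
    """Return the vector P(C=c | X=x) over the classes c."""
    j = joint(x, prior, cond)
    return j / j.sum()


def exact_error_rate(prior, cond):
    """Error rate 1 - sum_x max_c P(c, x), computed from the model alone.

    Enumerates all 2^d feature configurations, so the cost is
    exponential in the number of features d.
    """
    d = cond.shape[0]
    return 1.0 - sum(joint(x, prior, cond).max()
                     for x in itertools.product([0, 1], repeat=d))


def bootstrap_posterior_ci(x, prior, cond, n, reps=2000, alpha=0.05, seed=0):
    """Percentile parametric-bootstrap interval for P(C=1 | X=x).

    Each replicate simulates the counts of a training set of size n from
    the fitted model, re-estimates the parameters, and recomputes the
    posterior. Add-one smoothing (an assumption of this sketch) keeps the
    estimates defined when a simulated class count is zero.
    """
    rng = np.random.default_rng(seed)
    draws = np.empty(reps)
    for b in range(reps):
        n1 = rng.binomial(n, prior[1])         # simulated size of class 1
        n_c = np.array([n - n1, n1])
        counts = rng.binomial(n_c, cond)       # counts of X_j = 1 per class
        prior_b = (n_c + 1) / (n + 2)
        cond_b = (counts + 1) / (n_c + 2)
        draws[b] = posterior(x, prior_b, cond_b)[1]
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return lo, hi


if __name__ == "__main__":
    print("exact error rate:", exact_error_rate(prior, cond))
    print("posterior P(C=1 | x=(1,1)):", posterior((1, 1), prior, cond)[1])
    print("95% interval, n=500:",
          bootstrap_posterior_ci((1, 1), prior, cond, n=500))
```

With these toy parameters the enumeration covers only four feature configurations; the same loop over `itertools.product` grows as 2^d, which is the exponential cost the abstract mentions.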