Model Selection Criteria for Learning Belief Nets: An Empirical Comparison
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Bayesian belief nets (BNs) are often used for classification tasks, typically to return the most likely class label for a given instance. Many BN-learners, however, search for the BN that maximizes a different objective function, namely likelihood rather than classification accuracy, typically by first using a model selection criterion to identify an appropriate graphical structure and then finding good parameters for that structure. This paper considers a number of possible criteria for selecting the best structure, both generative (i.e., based on likelihood: BIC, BDe) and discriminative (i.e., Conditional BIC (CBIC), resubstitution Classification Error (CE), and Bias2+Variance (BV)). We empirically compare these criteria against a variety of "correct BN structures", both real-world and synthetic, over a range of complexities. We also explore different ways to set the parameters, addressing two questions: (1) Should we seek the parameters that maximize likelihood, or those that maximize conditional likelihood? (2) Should we use the entire training sample both to learn the parameters and to evaluate the models, or instead use one partition for parameter estimation and another for evaluation (cross-validation)? Our results show that the discriminative BV criterion is one of the best measures for identifying the optimal structure, while the discriminative CBIC performs poorly; that we should use the parameters that maximize likelihood; and that cross-validation is typically the better choice here.
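To make the generative scoring concrete, the following is a minimal sketch (not the paper's implementation) of the BIC criterion for a discrete BN: the log-likelihood at the maximum-likelihood parameters minus a penalty of half the number of free parameters times log N. The data representation and the `bic_score` helper are illustrative assumptions, not part of the paper.

```python
import math
from collections import Counter

def bic_score(data, parents):
    """BIC of a discrete BN structure on complete data.

    data    : list of dicts mapping variable name -> discrete value
    parents : dict mapping each variable to the list of its parent variables

    Returns sum over variables of [ML log-likelihood - (d/2) * log N],
    where d is the number of free parameters for that variable's CPT.
    """
    N = len(data)
    score = 0.0
    for var, pa in parents.items():
        # Counts of (parent configuration, child value) and of parent configurations.
        joint = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        pa_counts = Counter(tuple(row[p] for p in pa) for row in data)
        # ML log-likelihood: sum over observed cells of N(x, pa) * log(N(x, pa) / N(pa)).
        ll = sum(n * math.log(n / pa_counts[cfg]) for (cfg, _), n in joint.items())
        # Free parameters: (#states of var - 1) per observed parent configuration.
        states = len({row[var] for row in data})
        d = (states - 1) * max(1, len(pa_counts))
        score += ll - 0.5 * d * math.log(N)
    return score
```

With data where the class C strongly influences feature X, the structure containing the edge C -> X scores higher than the empty structure, since the likelihood gain outweighs the extra-parameter penalty.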