Probabilistic reasoning in intelligent systems: networks of plausible inference
Elements of information theory
Machine Learning - Special issue on learning with probabilistic representations
Neural Networks for Pattern Recognition
Discriminative, generative and imitative learning
Discriminative versus generative parameter and structure learning of Bayesian network classifiers
ICML '05 Proceedings of the 22nd international conference on Machine learning
Expectation maximization algorithms for conditional likelihoods
ICML '05 Proceedings of the 22nd international conference on Machine learning
Pattern Recognition and Machine Learning (Information Science and Statistics)
Efficient Heuristics for Discriminative Structure Learning of Bayesian Network Classifiers
The Journal of Machine Learning Research
We introduce three discriminative parameter learning algorithms for Bayesian network classifiers based on optimizing either the conditional likelihood (CL) or a lower-bound surrogate of the CL. One training procedure is based on the extended Baum-Welch (EBW) algorithm. The remaining two approaches iteratively optimize the parameters, initialized to the maximum likelihood (ML) estimates, with a two-step algorithm. In the first step, either the class posterior probabilities or the class assignments are determined from the current parameter estimates. In the second step, the parameters are updated based on these posteriors or class assignments, respectively. We show that one of these algorithms is strongly related to EBW. Additionally, we compare all algorithms to conjugate gradient conditional likelihood (CGCL) parameter optimization [1]. We present classification results for frame- and segment-based phonetic classification and handwritten digit recognition. Discriminative parameter learning shows a significant improvement over generative ML estimation for naive Bayes (NB) and tree augmented naive Bayes (TAN) structures on all data sets. In general, the performance improvement from discriminative parameter learning is large for simple Bayesian network structures that are not optimized for classification.
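To make the quantities in the abstract concrete, the following is a minimal sketch of the first step only: computing class posteriors (or hard class assignments) from the current parameters of a discrete naive Bayes classifier, together with the conditional log-likelihood that discriminative training aims to increase. It is not the paper's implementation. The function names, the toy data, and the Laplace smoothing constant `alpha` (used so that no probability is zero) are all assumptions made for illustration, and the parameter update of the second step is deliberately omitted because the abstract does not specify it.

```python
import math
from collections import Counter


def smoothed_ml_estimate(X, y, n_values, alpha=1.0):
    """Generative initialization: class priors and P(X_j = v | C = c).

    alpha is a hypothetical Laplace smoothing constant (an assumption);
    alpha = 0 would give the plain ML counts.
    """
    classes = sorted(set(y))
    class_count = Counter(y)
    prior = {c: class_count[c] / len(y) for c in classes}
    cond = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        cond[c] = []
        for j, k in enumerate(n_values):
            counts = Counter(row[j] for row in rows)
            total = len(rows) + alpha * k
            cond[c].append([(counts[v] + alpha) / total for v in range(k)])
    return prior, cond


def class_posteriors(x, prior, cond):
    """P(c | x) under the current parameters, computed in log space."""
    log_joint = {
        c: math.log(prior[c])
        + sum(math.log(cond[c][j][v]) for j, v in enumerate(x))
        for c in prior
    }
    m = max(log_joint.values())
    log_z = m + math.log(sum(math.exp(l - m) for l in log_joint.values()))
    return {c: math.exp(l - log_z) for c, l in log_joint.items()}


def hard_assignments(X, prior, cond):
    """Most probable class for each sample (the class-assignment variant)."""
    result = []
    for x in X:
        post = class_posteriors(x, prior, cond)
        result.append(max(post, key=post.get))
    return result


def conditional_log_likelihood(X, y, prior, cond):
    """CL objective: sum over samples of log P(c_n | x_n)."""
    return sum(
        math.log(class_posteriors(x, prior, cond)[c]) for x, c in zip(X, y)
    )


# Toy data (invented for illustration): two binary features, two classes.
X = [(0, 1), (0, 0), (1, 1), (1, 0), (0, 1), (1, 1)]
y = [0, 0, 1, 1, 0, 1]

prior, cond = smoothed_ml_estimate(X, y, n_values=(2, 2))
posteriors = [class_posteriors(x, prior, cond) for x in X]
assignments = hard_assignments(X, prior, cond)
cll = conditional_log_likelihood(X, y, prior, cond)
print(assignments, round(cll, 4))
```

A discriminative learner would alternate a computation like `class_posteriors` (or `hard_assignments`) with a parameter update and monitor `conditional_log_likelihood`, rather than the joint likelihood, across iterations.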