Probabilistic Decision Graphs (PDGs) are a class of graphical models that can naturally encode certain context-specific independencies that cannot always be efficiently captured by other popular models, such as Bayesian Networks (BNs). Furthermore, inference over a PDG can be carried out efficiently, in time linear in the size of the model. The problem of learning PDGs from data has been studied in the literature, but only for the case of complete data. We propose an algorithm for learning PDGs in the presence of missing data. The proposed method is based on the Expectation-Maximisation (EM) principle and estimates both the structure and the parameters of the model. We test our proposal on artificially generated data with different rates of missing cells as well as on real incomplete data, and we compare the PDG models learnt by our approach to the commonly used Bayesian Network model. The results indicate that the PDG model is less sensitive to the rate of missing data than the BN model. Moreover, although the BN models usually attain a higher likelihood, the learnt PDGs remain close to them in size, which makes the PDGs preferable for probabilistic inference purposes.
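The Expectation-Maximisation principle mentioned above can be illustrated with a minimal, generic sketch. The example below fits a two-component Bernoulli mixture, where the component indicator plays the role of the unobserved data; it is an illustrative stand-in, not the authors' PDG-specific algorithm, and the function name and setup are hypothetical. The E-step computes expected sufficient statistics (responsibilities) under the current parameters, and the M-step re-estimates the parameters from those expected counts.

```python
import random

def em_bernoulli_mixture(data, iters=50, seed=0):
    """Generic EM sketch: fit a 2-component Bernoulli mixture to binary data.

    Only illustrates the E-step / M-step cycle; the paper applies the same
    principle to estimating PDG structure and parameters from missing data.
    """
    rng = random.Random(seed)
    pi = 0.5                              # mixing weight of component 0
    theta = [rng.random(), rng.random()]  # P(x = 1 | component k)
    for _ in range(iters):
        # E-step: posterior responsibility of component 0 for each point
        resp = []
        for x in data:
            p0 = pi * (theta[0] if x else 1 - theta[0])
            p1 = (1 - pi) * (theta[1] if x else 1 - theta[1])
            resp.append(p0 / (p0 + p1))
        # M-step: maximum-likelihood update from expected counts
        n0 = sum(resp)
        pi = n0 / len(data)
        theta[0] = sum(r * x for r, x in zip(resp, data)) / n0
        theta[1] = sum((1 - r) * x for r, x in zip(resp, data)) / (len(data) - n0)
    return pi, theta
```

Each iteration is guaranteed not to decrease the observed-data likelihood, which is the property the structural-EM family of algorithms builds on.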