Structural-EM for learning PDG models from incomplete data

  • Authors:
  • Jens D. Nielsen;Rafael Rumí;Antonio Salmerón

  • Affiliations:
  • Department of Computer Science, University of Castilla-La Mancha, Campus Universitario Parque Científico y Tecnológico s/n, 02071 Albacete, Spain;Department of Statistics and Applied Mathematics, University of Almería, La Cañada de San Urbano s/n, 04120 Almería, Spain;Department of Statistics and Applied Mathematics, University of Almería, La Cañada de San Urbano s/n, 04120 Almería, Spain

  • Venue:
  • International Journal of Approximate Reasoning
  • Year:
  • 2010

Abstract

Probabilistic Decision Graphs (PDGs) are a class of graphical models that can naturally encode some context-specific independencies that cannot always be efficiently captured by other popular models, such as Bayesian Networks. Furthermore, inference can be carried out efficiently over a PDG, in time linear in the size of the model. The problem of learning PDGs from data has been studied in the literature, but only for the case of complete data. We propose an algorithm for learning PDGs in the presence of missing data. The proposed method is based on the Expectation-Maximisation principle for estimating the structure of the model as well as the parameters. We test our proposal on both artificially generated data with different rates of missing cells and real incomplete data. We also compare the PDG models learnt by our approach to the commonly used Bayesian Network (BN) model. The results indicate that the PDG model is less sensitive to the rate of missing data than the BN model. Moreover, although the BN models usually attain higher likelihood, the PDGs are close to them, also in size, which makes the learnt PDGs preferable for probabilistic inference purposes.