Exploiting Data Missingness in Bayesian Network Modeling

  • Authors:
  • Sérgio Rodrigues De Morais;Alex Aussem

  • Affiliations:
  • LIESP, INSA-Lyon, University of Lyon, Villeurbanne, France 69622;LIESP, UCBL, University of Lyon, Villeurbanne, France 69622

  • Venue:
  • IDA '09 Proceedings of the 8th International Symposium on Intelligent Data Analysis: Advances in Intelligent Data Analysis VIII
  • Year:
  • 2009

Abstract

This paper proposes a framework that uses Bayesian networks (BN) to represent the statistical dependencies between the existing random variables and additional dummy Boolean variables, each of which indicates whether the value of the corresponding random variable is present or absent. We show how augmenting the BN with these additional variables helps pinpoint the mechanism through which missing data contributes to the classification task. The missing data mechanism is thus explicitly taken into account when predicting the class variable from the data at hand. Extensive experiments on synthetic and real-world incomplete data sets reveal that the missingness information improves classification accuracy.
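
The augmentation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `add_missingness_indicators`, the `R_` prefix, the `exclude` parameter, and the use of `None` to mark a missing value are all assumptions made for illustration. The sketch only builds the widened data table; learning a Bayesian network over the original and dummy variables is left to whatever structure learner is used.

```python
from typing import Any, Dict, Iterable, List


def add_missingness_indicators(
    rows: List[Dict[str, Any]],
    exclude: Iterable[str] = (),
    prefix: str = "R_",
) -> List[Dict[str, Any]]:
    """Return copies of `rows` with one Boolean indicator per variable.

    Assumption: a missing value is encoded as None (or the key is absent).
    The indicator is True when the value is present, False when missing.
    Variables listed in `exclude` (e.g. the class variable) get no indicator.
    """
    excluded = set(exclude)
    # Collect variable names in first-seen order so the output is deterministic.
    columns: Dict[str, None] = {}
    for row in rows:
        for name in row:
            if name not in excluded:
                columns[name] = None

    augmented = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are not modified
        for name in columns:
            new_row[prefix + name] = row.get(name) is not None
        augmented.append(new_row)
    return augmented


# Hypothetical toy data: two predictors and a class variable "cls".
rows = [
    {"age": 34, "income": None, "cls": "yes"},
    {"age": None, "income": 52000, "cls": "no"},
]
augmented = add_missingness_indicators(rows, exclude=("cls",))
print(augmented[0])
# {'age': 34, 'income': None, 'cls': 'yes', 'R_age': True, 'R_income': False}
```

With the indicators stored as ordinary columns, a dependency between, say, `R_income` and `cls` can be detected by the same tests used for any other pair of variables, which is the sense in which missingness itself becomes usable evidence for classification.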