Robust Learning with Missing Data

  • Authors: Marco Ramoni; Paola Sebastiani
  • Affiliations: Children's Hospital Informatics Program, Harvard Medical School, Boston, MA 02115, USA (marco_ramoni@harvard.edu); Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA 01002, USA (sebas@math.umass.edu)
  • Venue: Machine Learning
  • Year: 2001

Abstract

This paper introduces a new method, called the robust Bayesian estimator (RBE), to learn conditional probability distributions from incomplete data sets. The intuition behind the RBE is that, when no information about the pattern of missing data is available, an incomplete database constrains the set of all possible estimates; this paper provides a characterization of these constraints. An experimental comparison with two popular methods for estimating conditional probability distributions from incomplete data, Gibbs sampling and the EM algorithm, shows a gain in robustness. An application of the RBE to quantify a naive Bayesian classifier from an incomplete data set illustrates its practical relevance.
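
The idea that an incomplete database constrains the set of possible estimates can be illustrated with a small sketch. The code below is not the RBE itself: it computes only the elementary bounds on a relative frequency for a single discrete variable, obtained by imagining that every missing entry either avoids a given state (lower bound) or takes it (upper bound). The function name `frequency_bounds` and the use of `None` to mark a missing entry are assumptions made for this example.

```python
from collections import Counter
from fractions import Fraction


def frequency_bounds(values, states):
    """Bound the relative frequency of each state in an incomplete sample.

    `values` is a list of observations in which None marks a missing entry
    (an assumption of this sketch). For each state, the lower bound assumes
    no missing entry takes that state; the upper bound assumes all of them do.
    """
    total = len(values)
    if total == 0:
        raise ValueError("values must not be empty")

    # Count the observed entries per state and the number of missing entries.
    counts = Counter(v for v in values if v is not None)
    n_missing = sum(1 for v in values if v is None)

    return {
        s: (Fraction(counts[s], total), Fraction(counts[s] + n_missing, total))
        for s in states
    }


# Five cases, two of them missing.
sample = ["a", "a", "b", None, None]
bounds = frequency_bounds(sample, states=["a", "b"])
for state, (low, high) in bounds.items():
    print(state, low, high)
```

With no missing entries the two bounds coincide, and the interval widens as the share of missing entries grows, which is the sense in which the data constrain the estimate without any assumption about why entries are missing.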