Learning process models with missing data

  • Authors:
  • Will Bridewell, Pat Langley, Steve Racunas, Stuart Borrett

  • Affiliations:
  • Computational Learning Laboratory, CSLI, Stanford University, Stanford, CA (all authors)

  • Venue:
  • ECML'06: Proceedings of the 17th European Conference on Machine Learning
  • Year:
  • 2006

Abstract

In this paper, we review the task of inductive process modeling, which uses domain knowledge to compose explanatory models of continuous dynamic systems. Next, we discuss approaches to learning with missing values in time series, noting that these efforts are typically applied to descriptive modeling tasks that use little background knowledge. We also point out that these methods assume data are missing at random, a condition that may not hold in scientific domains. Using experiments with synthetic and natural data, we compare an expectation-maximization (EM) approach with one that simply ignores the missing data. Results indicate that EM leads to more accurate models in most cases, even though its basic assumptions are unmet. We conclude by discussing the implications of our findings along with directions for future work.
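
The abstract contrasts two strategies for handling gaps in time series: EM-style imputation and simply ignoring the missing points. As a concrete illustration of that contrast (not the paper's actual process-modeling procedure), the sketch below estimates the coefficient of a simple AR(1) series by alternating imputation of the missing values with re-fitting, and compares it against a baseline that drops incomplete observation pairs. The AR(1) model and all function names here are illustrative assumptions.

```python
import numpy as np

def em_ar1(x, n_iters=50):
    """EM-style estimation for an AR(1) series with missing values (NaNs).
    E-step: impute missing points from the current model.
    M-step: re-fit the AR coefficient by least squares on the completed series.
    Illustrative sketch only; not the procedure used in the paper."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    # Initialize missing entries with the mean of the observed values.
    x_hat = np.where(missing, np.nanmean(x), x)
    a = 0.0
    for _ in range(n_iters):
        # M-step: least-squares fit of x[t] = a * x[t-1].
        prev, curr = x_hat[:-1], x_hat[1:]
        a = (prev @ curr) / (prev @ prev)
        # E-step: replace missing entries with their one-step predictions
        # (the first entry, if missing, keeps its current imputed value).
        pred = np.concatenate(([x_hat[0]], a * x_hat[:-1]))
        x_hat = np.where(missing, pred, x)
    return a, x_hat

def ar1_ignore_missing(x):
    """Baseline: drop any (x[t-1], x[t]) pair containing a missing value."""
    x = np.asarray(x, dtype=float)
    keep = ~np.isnan(x[:-1]) & ~np.isnan(x[1:])
    prev, curr = x[:-1][keep], x[1:][keep]
    return (prev @ curr) / (prev @ prev)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_a = 0.8
    x = np.zeros(500)
    for t in range(1, 500):
        x[t] = true_a * x[t - 1] + rng.normal(scale=0.1)
    x_obs = x.copy()
    x_obs[rng.random(500) < 0.3] = np.nan  # 30% missing at random
    print("EM-style estimate:", em_ar1(x_obs)[0])
    print("Ignore-missing:   ", ar1_ignore_missing(x_obs))
```

In this missing-at-random setting both estimates should land near the true coefficient; the paper's experiments probe how such methods behave when the missing-at-random assumption does not hold, as is common in scientific data.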