Learning missing values from summary constraints

  • Authors: Xintao Wu; Daniel Barbará
  • Affiliations: UNC at Charlotte, Charlotte, NC; George Mason University, Fairfax, VA
  • Venue: ACM SIGKDD Explorations Newsletter
  • Year: 2002

Abstract

Real-world data sets often contain errors and inconsistencies. Although this is an important problem, it has received relatively little attention in the research community. In this paper, we examine the problem of learning missing values when some summary information is available. We use linear algebra and constraint programming techniques to learn the missing values from summary information that is either known a priori or derived from the raw data. We reconstruct the missing values using different methods in three scenarios: ideal-constrained, under-constrained, and over-constrained. Furthermore, for a range query involving missing values, we give lower and upper bounds on the values using constraint programming techniques. We believe that the theory of linear algebra and constraint programming constitutes a sound basis for learning missing values when summary information is available.
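
To make the ideal-constrained and under-constrained scenarios concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical 2x2 table whose row sums (30, 50) and column sums (25, 55) are the known summary information. The function names solve_linear_system and propagate_sum_bounds, the cell names c00 to c11, and all numbers are illustrative assumptions. In the first case one cell (10) is observed, so three independent sum constraints determine the three missing cells exactly. In the second case all four cells are missing and assumed non-negative, so only lower and upper bounds can be derived; the sketch uses simple interval propagation over the sum constraints, which yields valid bounds but is not guaranteed to give the tightest ones in general.

```python
from fractions import Fraction


def solve_linear_system(A, b):
    """Solve A u = b by Gauss-Jordan elimination with exact rationals.

    Illustrates the ideal-constrained case: as many independent
    constraints as missing values. Raises ValueError if A is singular.
    """
    n = len(A)
    # Build the augmented matrix [A | b].
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular system: not ideal-constrained")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def propagate_sum_bounds(bounds, constraints):
    """Tighten [lo, hi] intervals until no sum constraint changes them.

    bounds: dict name -> (lo, hi); constraints: list of (names, total),
    each meaning sum(names) == total. Raises ValueError if an interval
    becomes empty (the constraints are inconsistent).
    """
    bounds = {name: list(b) for name, b in bounds.items()}
    changed = True
    while changed:
        changed = False
        for names, total in constraints:
            for name in names:
                others = [n for n in names if n != name]
                new_lo = max(bounds[name][0], total - sum(bounds[n][1] for n in others))
                new_hi = min(bounds[name][1], total - sum(bounds[n][0] for n in others))
                if new_lo > new_hi:
                    raise ValueError("inconsistent constraints for " + name)
                if [new_lo, new_hi] != bounds[name]:
                    bounds[name] = [new_lo, new_hi]
                    changed = True
    return {name: tuple(b) for name, b in bounds.items()}


# Ideal-constrained: table [[10, x], [y, z]], unknowns ordered (x, y, z).
#   row 1:    10 + x = 30  ->  x     = 20
#   column 1: 10 + y = 25  ->  y     = 15
#   row 2:                     y + z = 50
recovered = solve_linear_system([[1, 0, 0], [0, 1, 0], [0, 1, 1]], [20, 15, 50])

# Under-constrained: all four cells missing, assumed non-negative.
INF = float("inf")
cell_bounds = propagate_sum_bounds(
    {"c00": (0, INF), "c01": (0, INF), "c10": (0, INF), "c11": (0, INF)},
    [(("c00", "c01"), 30), (("c10", "c11"), 50),
     (("c00", "c10"), 25), (("c01", "c11"), 55)],
)
```

Here recovered holds the three reconstructed cells, and cell_bounds maps each missing cell to an interval; a range query over missing cells could be answered by summing the corresponding lower and upper ends. An over-constrained table (more constraints than unknowns, possibly inconsistent) would instead call for an approximate fit such as least squares, which this sketch does not cover.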