Modeling and Imputation of Large Incomplete Multidimensional Datasets

  • Authors:
  • Xintao Wu; Daniel Barbará

  • Affiliations:
  • -;-

  • Venue:
  • DaWaK 2002 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2002

Abstract

Missing or incomplete data are commonplace in large real-world databases. In this paper, we study the problem of missing values that occur in the measure dimension of a data cube. We propose a two-part mixture model, which combines a logistic model and a loglinear model, to predict and impute the missing values. The logistic model is applied to predict the positions of missing values, while the loglinear model is applied to compute the estimates. Experimental results on real and synthetic datasets are presented.
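
The two-part idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the two parts are combined, so the code below assumes that a logistic model decides whether a missing cell should receive a value at all, and that an independence loglinear model (fitted by an EM-style fill on a 2-D table) supplies the estimate. All function names, the per-cell features, and the 0.5 threshold are hypothetical.

```python
import math

def predict_proba(w, x):
    """Logistic probability for feature vector x; w[0] is the bias."""
    z = w[0] + sum(wk * v for wk, v in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Fit logistic weights by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, xi)
            grad[0] += err
            for k, v in enumerate(xi):
                grad[k + 1] += err * v
        w = [wk + lr * g / len(X) for wk, g in zip(w, grad)]
    return w

def fit_loglinear(table, iters=500):
    """Estimate missing cells (None) of a 2-D table under the independence
    loglinear model, log m_ij = u + u_i + u_j, i.e. m_ij = row_i * col_j / total.
    Observed cells are kept; missing cells are refilled on every pass."""
    rows, cols = len(table), len(table[0])
    observed = [v for r in table for v in r if v is not None]
    start = sum(observed) / len(observed)  # initial guess: observed mean
    filled = [[start if v is None else float(v) for v in r] for r in table]
    for _ in range(iters):
        row_tot = [sum(r) for r in filled]
        col_tot = [sum(filled[i][j] for i in range(rows)) for j in range(cols)]
        total = sum(row_tot)
        for i in range(rows):
            for j in range(cols):
                if table[i][j] is None:
                    filled[i][j] = row_tot[i] * col_tot[j] / total
    return filled

def impute(table, cell_features, w, threshold=0.5):
    """Assumed combination rule: a missing cell gets the loglinear estimate
    when the logistic model scores it at or above the threshold, else 0.0."""
    est = fit_loglinear(table)
    out = [list(r) for r in table]
    for (i, j), x in cell_features.items():
        if table[i][j] is None:
            keep = predict_proba(w, x) >= threshold
            out[i][j] = est[i][j] if keep else 0.0
    return out

# Hypothetical training data: one feature per cell, label 1 = cell holds a value.
w = fit_logistic([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
cube = [[10, 20], [30, None]]           # toy 2-D slice with one missing measure
result = impute(cube, {(1, 1): [3.0]}, w)
```

In this toy table the observed cells are consistent with independence of the two dimensions, so the loglinear estimate for the missing cell approaches 30 × 20 / 10 = 60. A real data cube would have more dimensions and would likely need interaction terms in the loglinear model, which this sketch omits.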