Modeling and Imputation of Large Incomplete Multidimensional Datasets

Authors:
Xintao Wu;Daniel Barbará
Affiliations:
-;-
Venue:
DaWaK 2002 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
Year:
2002


Abstract

Missing or incomplete data are commonplace in large real-world databases. In this paper, we study the problem of missing values that occur in the measure dimension of a data cube. We propose a two-part mixture model, which combines a logistic model and a loglinear model, to predict and impute the missing values. The logistic model is applied to predict the positions of missing values, while the loglinear model is applied to compute the estimates. Experimental results on real and synthetic datasets are presented.
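
The two-part idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it is one plausible reading under loud assumptions: the cube is two-dimensional, the loglinear part is the simple independence model log m_ij = u + a_i + b_j fitted to the observed cells with an EM-style loop, and the logistic part uses a single hypothetical per-cell covariate to decide whether an unobserved cell holds a missing value (to be imputed) or is legitimately empty (left at zero). All function names, the covariate, and the toy data are invented for illustration.

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, steps=500):
    """Part 1: logistic regression fitted by plain gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * (X.T @ (p - y)) / len(y)
    return w

def predict_proba(X, w):
    """Probability that a cell holds a missing value (assumed label meaning)."""
    return 1.0 / (1.0 + np.exp(-(X @ w)))

def loglinear_estimates(cube, observed, iters=200):
    """Part 2: independence loglinear model fitted to observed cells only.

    Each pass fits row_total * col_total / grand_total to the completed
    table, then refills the unobserved cells with the fitted values.
    """
    est = np.where(observed, cube, cube[observed].mean())
    for _ in range(iters):
        rows = est.sum(axis=1, keepdims=True)
        cols = est.sum(axis=0, keepdims=True)
        fitted = rows * cols / est.sum()
        est = np.where(observed, cube, fitted)
    return fitted

def impute(cube, observed, p_missing, threshold=0.5):
    """Fill unobserved cells flagged by part 1 with the part 2 estimate."""
    fitted = loglinear_estimates(cube, observed)
    fill = (~observed) & (p_missing >= threshold)
    return np.where(fill, fitted, np.where(observed, cube, 0.0))

# Toy cube with an exactly multiplicative structure (hypothetical data).
true = np.outer([1.0, 2.0, 3.0], [10.0, 20.0, 30.0, 40.0])
observed = np.ones(true.shape, dtype=bool)
observed[1, 2] = False   # assumed to be a genuinely missing measure
observed[0, 0] = False   # assumed to be a legitimately empty cell
cube = np.where(observed, true, np.nan)

# Hypothetical labeled history: covariate value -> "was a missing value".
X_train = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y_train = np.array([0.0, 0.0, 1.0, 1.0])
w = fit_logistic(X_train, y_train)

# Hypothetical per-cell covariate for the cube (intercept column + covariate).
covariate = np.zeros(true.shape)
covariate[1, 2] = 2.0
covariate[0, 0] = -2.0
X_cells = np.column_stack([np.ones(true.size), covariate.ravel()])
p_missing = predict_proba(X_cells, w).reshape(true.shape)

imputed = impute(cube, observed, p_missing)
```

In this toy setup the cell flagged as missing receives the loglinear estimate, the cell judged legitimately empty stays at zero, and observed cells are left unchanged. A real cube would need more dimensions, interaction terms in the loglinear model, and predictors for the logistic part chosen from the data.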