Real-world data sets often contain errors and inconsistencies. Although this is a very important problem, it has received relatively little attention in the research community. In this paper, we examine the problem of learning missing values when some summary information is available. We use linear algebra and constraint programming techniques to learn the missing values from summary information that is either known a priori or derived from the raw data. We reconstruct the missing values by different methods in three scenarios: ideal-constrained, under-constrained, and over-constrained. Furthermore, for a range query involving missing values, we also give lower and upper bounds for the values using constraint programming techniques. We believe that the theory of linear algebra and constraint programming constitutes a sound basis for learning missing values when summary information is available.
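To make the three scenarios concrete, the following is a minimal sketch and not necessarily the method used in the paper. It assumes the summary information can be written as linear constraints A x = b over the missing values x, and that A has full rank. Under those assumptions it solves the system directly in the ideal-constrained case, takes the least-squares solution in the over-constrained case, and takes the minimum-norm solution in the under-constrained case. The function names `reconstruct` and `range_sum_bounds` are hypothetical. The bounds routine covers only the simplest setting, a single known total with per-value lower and upper limits, where interval reasoning stands in for a general constraint solver.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def solve(a, b):
    """Solve a square, nonsingular system a x = b (Gauss-Jordan, partial pivoting)."""
    n = len(a)
    aug = [[float(x) for x in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def reconstruct(a, b):
    """Estimate missing values x from summary constraints a x = b.

    Assumption (not from the source): a has full rank.
    """
    rows, cols = len(a), len(a[0])
    at = transpose(a)
    if rows == cols:
        # Ideal-constrained: as many independent constraints as unknowns.
        return solve(a, b)
    if rows > cols:
        # Over-constrained: least squares via the normal equations.
        return solve(matmul(at, a), matvec(at, b))
    # Under-constrained: minimum-norm solution x = A^T (A A^T)^-1 b.
    return matvec(at, solve(matmul(a, at), b))


def range_sum_bounds(total, lows, highs, query):
    """Bounds on sum(x[i] for i in query), given sum(x) == total
    and lows[i] <= x[i] <= highs[i] for every missing value."""
    inside = set(query)
    outside = [i for i in range(len(lows)) if i not in inside]
    lo = max(sum(lows[i] for i in inside),
             total - sum(highs[i] for i in outside))
    hi = min(sum(highs[i] for i in inside),
             total - sum(lows[i] for i in outside))
    if lo > hi:
        raise ValueError("constraints are infeasible")
    return lo, hi
```

For example, with three missing values and the pairwise sums x1 + x2 = 10, x2 + x3 = 14, x1 + x3 = 12, calling `reconstruct([[1, 1, 0], [0, 1, 1], [1, 0, 1]], [10, 14, 12])` recovers the values exactly. If only the total of 18 is known, the minimum-norm choice spreads it evenly across the three values, which is one reasonable default among many. Richer constraint sets would call for a linear programming or constraint solver rather than the closed-form bounds used here.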