Handling missing data by using stored truth values

Authors:
G. H. Gessert
Affiliations:
Director, Corporate MIS, Primerica Corporation, 300 St. Paul Place, Baltimore, MD
Venue:
ACM SIGMOD Record
Year:
1991


Abstract

This paper proposes a method for handling inapplicable and unknown missing data. The method is based on (1) storing default values (instead of null values) in place of missing data, and (2) storing truth values that describe the logical status of the default values in corresponding fields of corresponding tables. Four-valued logic is used so that the logical status of the default data values can be described not just as true or false, but also as inapplicable or unknown. This method, in contrast to the “hidden byte” approach, has two important advantages: (1) because the logical status of all data is represented explicitly in tables, all 4-valued operations can be handled via a 2-valued data manipulation language, such as SQL, so language extensions for handling missing data (e.g., “IS NULL”) are not necessary; (2) because data fields always contain a default value (as opposed to a null value or mark), it is possible to do arithmetic across missing data and to interpret the logical status of the result by means of logical operations on the corresponding stored truth values.
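
The idea in the abstract can be illustrated with a minimal sketch, which is not taken from the paper. It uses Python's built-in sqlite3 module. The table layout, the column names (commission, commission_tv), the single-character status codes, and the rule used to derive the status of the total are all assumptions made for illustration; the abstract does not give the paper's actual four-valued truth tables.

```python
import sqlite3

# Hypothetical single-character codes for the four stored truth values.
TRUE, FALSE, UNKNOWN, INAPPLICABLE = "T", "F", "U", "I"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " name TEXT NOT NULL,"
    " commission REAL NOT NULL,"     # always holds a value, never NULL
    " commission_tv TEXT NOT NULL)"  # stored truth value for that field
)
rows = [
    ("Avery", 500.0, TRUE),         # commission is known
    ("Blake", 0.0, UNKNOWN),        # default stored; real value unknown
    ("Casey", 0.0, INAPPLICABLE),   # default stored; no commission applies
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)

# An ordinary two-valued comparison takes the place of an IS NULL test.
unknown_names = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM employee WHERE commission_tv = ? ORDER BY name",
        (UNKNOWN,),
    )
]

# Arithmetic runs across every row because each field holds a real number.
total = conn.execute("SELECT SUM(commission) FROM employee").fetchone()[0]

# Assumed combining rule (not from the paper): the total is marked unknown
# if any contributing value is unknown; otherwise it is marked true.
n_unknown = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE commission_tv = ?", (UNKNOWN,)
).fetchone()[0]
total_tv = UNKNOWN if n_unknown else TRUE

print(unknown_names, total, total_tv)
```

Both queries use only plain equality and aggregation, so no null-specific syntax is needed. The status of the computed total comes from the stored truth values, not from the data values themselves.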