Missing data is a well-recognized problem in large datasets, widely discussed in the statistics and data analysis literature. Many programming environments provide explicit codes for missing data, but these codes are not standardized and are not always used. This lack of standardization is one of the leading causes of the subtle problem of disguised missing data, in which unknown, inapplicable, or otherwise unspecified responses are encoded as valid data values. Following a brief overview of the problem of explicitly coded missing data, this paper discusses the sources, consequences, and detection of disguised missing data, including two real-world examples. As the first of these examples illustrates, the consequences of disguised missing data can be quite serious. The key to detecting it lies first in recognizing disguised missing data as a possibility and second in finding a sufficiently informative view of the data to reveal its presence.
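To make the idea of an "informative view" concrete, the following is a minimal sketch, not the paper's method. It assumes that a disguised missing value often shows up as a single valid-looking value that occurs far more often than its neighbors. The function name `frequent_value_candidates`, the `min_share` threshold, and the sample ages are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def frequent_value_candidates(values, min_share=0.2):
    """Return (value, share) pairs whose relative frequency is >= min_share.

    Explicitly coded missing entries (None) are dropped first, so the
    result only lists valid-looking values that are suspiciously common.
    The threshold is an assumption; a sensible value depends on the data.
    """
    present = [v for v in values if v is not None]
    if not present:
        return []
    counts = Counter(present)
    total = len(present)
    return [
        (value, count / total)
        for value, count in counts.most_common()
        if count / total >= min_share
    ]


# Hypothetical ages in which 0 was recorded whenever the age was unknown.
ages = [34, 0, 51, 0, 27, 0, 45, 62, 0, 38]
print(frequent_value_candidates(ages))
```

A flagged value is only a candidate: a frequent value may be entirely legitimate, so a view like this points to where closer inspection is needed and does not by itself establish that data are missing.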