Partial identification with missing data: concepts and findings

Authors:
Charles F. Manski
Affiliations:
Department of Economics and Institute for Policy Research, Northwestern University, 2001 Sheridan Road, Evanston, IL 60208, USA
Venue:
International Journal of Approximate Reasoning
Year:
2005

Abstract

The traditional way to cope with missing data problems has been to combine the available data with assumptions strong enough to point-identify the probability distribution describing a population. However, such assumptions often are not well motivated. An alternative approach is to first determine what may be inferred using the empirical evidence alone and then study the identifying power of credible assumptions. The generic result is that one may partially identify the probability distribution of interest: an identification region gives the set of distributions generated by combining the available data with all possible distributions of missing data. This expository article collects findings on partial identification with missing data. The focus is on identification of means, quantiles, and other parameters that respect stochastic dominance. It is shown how distributional assumptions using instrumental variables shrink the identification regions for these parameters. Findings are given on conditional prediction with missing data on outcomes or covariates.
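
As an illustration of the identification region described in the abstract, the following minimal sketch computes worst-case bounds on a population mean when some outcomes are missing and no assumption is made about the missing values. It assumes the outcome is known to lie in a bounded interval [y_min, y_max]; the function name `mean_bounds`, its parameters, and the use of `None` to mark a missing outcome are hypothetical choices made for this example and do not come from the article. The sample quantities stand in for their population counterparts, so sampling variation is ignored.

```python
from typing import Optional, Sequence, Tuple


def mean_bounds(
    outcomes: Sequence[Optional[float]],
    y_min: float,
    y_max: float,
) -> Tuple[float, float]:
    """Worst-case bounds on the mean of a bounded outcome with missing data.

    `None` marks a missing outcome (an assumption of this sketch).
    The bounds follow from the law of total probability:
        E[y] = E[y | observed] * P(observed) + E[y | missing] * P(missing),
    where E[y | missing] is unknown but must lie in [y_min, y_max].
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one sampled unit")
    if y_min > y_max:
        raise ValueError("y_min must not exceed y_max")

    observed = [y for y in outcomes if y is not None]

    # Contribution of the observed outcomes: E[y | observed] * P(observed).
    observed_part = sum(observed) / n
    # Share of units whose outcome is missing: P(missing).
    p_missing = (n - len(observed)) / n

    # Replace every missing outcome by the smallest (largest) logically
    # possible value to obtain the lower (upper) endpoint.
    lower = observed_part + y_min * p_missing
    upper = observed_part + y_max * p_missing
    return lower, upper


# Example: a binary outcome with two of five values missing.
lower, upper = mean_bounds([1.0, 0.0, 1.0, None, None], y_min=0.0, y_max=1.0)
print(f"mean lies in [{lower:.2f}, {upper:.2f}]")
```

The width of the interval equals (y_max - y_min) times the fraction of missing outcomes, so the region collapses to a point only when nothing is missing. Credible assumptions of the kind the abstract mentions would narrow the interval further; they are not modeled in this sketch.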