Synthetic data for small area estimation

  • Authors:
  • Joseph W. Sakshaug; Trivellore E. Raghunathan

  • Affiliations:
  • University of Michigan, Ann Arbor, MI; University of Michigan, Ann Arbor, MI

  • Venue:
  • PSD'10 Proceedings of the 2010 international conference on Privacy in statistical databases
  • Year:
  • 2010

Abstract

Researchers increasingly demand access to microdata for small geographic areas in order to compute estimates that may affect policy decisions at the local level. Privacy and confidentiality concerns prevent statistical agencies from releasing detailed geographic identifiers in public-use data sets. Existing procedures give researchers access to restricted geographic information through a limited number of Research Data Centers (RDCs), but this mode of access is not convenient for all. An alternative approach is to release fully synthetic, public-use microdata files that contain enough geographic detail to permit small area estimation. We illustrate this method by using a Bayesian hierarchical model to create synthetic data sets from the posterior predictive distribution. We evaluate the analytic validity of the synthetic data by comparing small area estimates obtained from the synthetic data with estimates obtained from the U.S. American Community Survey.
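
The general idea of drawing synthetic records from a posterior predictive distribution can be sketched as follows. This is a minimal illustration only, not the authors' model: it assumes a simple two-level normal model (records within areas, area means around a common mean) with known standard deviations `sigma` and `tau` and a flat prior on the overall mean. The function name `synthesize` and all of its parameters are hypothetical.

```python
import numpy as np

def synthesize(y, area, n_areas, sigma=1.0, tau=1.0,
               n_sets=5, burn_in=200, thin=20, seed=0):
    """Draw n_sets synthetic copies of y from the posterior predictive
    distribution of a toy hierarchical normal model (an assumption here):
        y_ij    ~ N(theta_j, sigma^2)   records within area j
        theta_j ~ N(mu, tau^2)          area means
    with sigma and tau treated as known and a flat prior on mu.
    """
    rng = np.random.default_rng(seed)
    counts = np.bincount(area, minlength=n_areas)           # n_j
    sums = np.bincount(area, weights=y, minlength=n_areas)  # sum of y in area j
    prec = counts / sigma**2 + 1.0 / tau**2                 # posterior precision of theta_j
    mu = y.mean()                                           # starting value
    synthetic = []
    for it in range(burn_in + n_sets * thin):
        # Gibbs step 1: area means given mu and the observed data
        theta = rng.normal((sums / sigma**2 + mu / tau**2) / prec,
                           np.sqrt(1.0 / prec))
        # Gibbs step 2: overall mean given the area means
        mu = rng.normal(theta.mean(), tau / np.sqrt(n_areas))
        # After burn-in, keep every thin-th draw and simulate a full
        # synthetic data set from it (one posterior predictive draw)
        if it >= burn_in and (it - burn_in + 1) % thin == 0:
            synthetic.append(rng.normal(theta[area], sigma))
    return synthetic
```

A rough analogue of the validity check described above is to compare area-level means computed from each synthetic data set with the same means computed from the observed data. In this sketch every record is replaced by a simulated value while the area labels are retained.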