Preserving confidentiality of high-dimensional tabulated data: Statistical and computational issues

  • Authors:
  • Adrian Dobra;Alan F. Karr;Ashish P. Sanil

  • Affiliations:
  • National Institute of Statistical Sciences, Research Triangle Park, NC 27709-4006, USA;National Institute of Statistical Sciences, Research Triangle Park, NC 27709-4006, USA;National Institute of Statistical Sciences, Research Triangle Park, NC 27709-4006, USA

  • Venue:
  • Statistics and Computing
  • Year:
  • 2003

Abstract

Dissemination of information derived from large contingency tables formed from confidential data is a major responsibility of statistical agencies. In this paper we present solutions to several computational and algorithmic problems that arise in the dissemination of cross-tabulations (marginal sub-tables) from a single underlying table. These include data structures that exploit sparsity to support efficient computation of marginals and algorithms such as iterative proportional fitting, as well as a generalized form of the shuttle algorithm that computes sharp bounds on (small, confidentiality-threatening) cells in the full table from arbitrary sets of released marginals. We give examples illustrating the techniques.
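
As a rough illustration of two ideas named in the abstract, the sketch below stores only the nonzero cells of a table, sums them onto a chosen set of axes to obtain a marginal, and then bounds one cell from released one-way marginals. It is a minimal sketch under stated assumptions, not the paper's data structures or its generalized shuttle algorithm: the bounds shown are the classical Fréchet bounds for a two-way table given its row and column totals, which is the simplest special case of bounding cells from marginals. The function names `marginal` and `frechet_bounds` and the example counts are hypothetical.

```python
from collections import defaultdict


def marginal(table, axes):
    """Sum a sparse table onto the given axes.

    `table` maps a cell index tuple to its count; zero cells are simply
    absent, so the cost is proportional to the number of nonzero cells.
    """
    out = defaultdict(int)
    for cell, count in table.items():
        out[tuple(cell[a] for a in axes)] += count
    return dict(out)


def frechet_bounds(row_total, col_total, grand_total):
    """Classical Frechet bounds on one cell of a two-way table,
    given only its row total, column total, and the grand total."""
    lower = max(0, row_total + col_total - grand_total)
    upper = min(row_total, col_total)
    return lower, upper


# Hypothetical 2 x 2 x 2 table; only the four nonzero cells are stored.
table = {
    (0, 0, 0): 5,
    (0, 1, 1): 1,
    (1, 0, 1): 7,
    (1, 1, 0): 2,
}

rows = marginal(table, (0,))    # one-way marginal over axis 0
cols = marginal(table, (1,))    # one-way marginal over axis 1
total = sum(table.values())

# Bounds on cell (0, 1) of the axis-0 by axis-1 marginal,
# using only the two released one-way marginals.
bounds = frechet_bounds(rows[(0,)], cols[(1,)], total)
```

Here the true value of that cell is 1, a small count of the kind the abstract calls confidentiality-threatening, and the bounds computed from the two one-way marginals are 0 and 3. Releasing further marginals can only tighten such bounds, which is why an agency would want sharp bounds for arbitrary sets of released marginals before deciding what to publish.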