Information Sciences: an International Journal
Disseminating information derived from large contingency tables formed from confidential data is a major responsibility of statistical agencies. In this paper we present solutions to several computational and algorithmic problems that arise in the dissemination of cross-tabulations (marginal sub-tables) from a single underlying table. These include data structures that exploit sparsity to support efficient computation of marginals and of algorithms such as iterative proportional fitting, as well as a generalized form of the shuttle algorithm that computes sharp bounds on small, confidentiality-threatening cells in the full table from arbitrary sets of released marginals. We give examples illustrating the techniques.
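To make two of the ideas above concrete, the following is a minimal sketch, not the paper's implementation. It assumes a sparse table can be stored as a dictionary keyed by cell index tuples, so that only non-zero cells take space, and it runs a textbook form of iterative proportional fitting against released marginals. The names `marginal` and `ipf` and the 2x2x2 example counts are hypothetical, chosen for illustration only. The generalized shuttle algorithm is not sketched here.

```python
from collections import defaultdict
from itertools import product


def marginal(table, axes):
    """Sum a sparse table (dict: cell tuple -> count) over every axis not in `axes`."""
    out = defaultdict(float)
    for cell, count in table.items():
        out[tuple(cell[a] for a in axes)] += count
    return dict(out)


def ipf(shape, margins, iterations=50):
    """Textbook iterative proportional fitting.

    shape:   number of levels per dimension, e.g. (2, 2, 2)
    margins: list of (axes, marginal dict) pairs to be matched
    Starts from a table of ones and rescales it to each margin in turn.
    """
    fitted = {cell: 1.0 for cell in product(*(range(n) for n in shape))}
    for _ in range(iterations):
        for axes, target in margins:
            current = marginal(fitted, axes)
            for cell in fitted:
                key = tuple(cell[a] for a in axes)
                if current[key] > 0:
                    fitted[cell] *= target.get(key, 0.0) / current[key]
    return fitted


# Hypothetical 2x2x2 table of counts; only the non-zero cells are stored.
table = {(0, 0, 0): 5, (0, 1, 1): 3, (1, 0, 1): 2, (1, 1, 0): 7}

# Released marginals (assumed for illustration): dimensions (0, 1) jointly, and dimension 2 alone.
margins = [((0, 1), marginal(table, (0, 1))), ((2,), marginal(table, (2,)))]
fitted = ipf((2, 2, 2), margins)
```

Because the cost of `marginal` depends on the number of stored cells rather than on the size of the full cross-classification, the dictionary representation is one simple way to exploit sparsity. The fitted table in this sketch is dense; a production version would need to handle structural zeros and convergence checks, which are omitted here.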