Model Based Disclosure Protection

Authors:
Silvia Polettini; Luisa Franconi; Julian Stander
Venue:
Inference Control in Statistical Databases, From Theory to Practice
Year:
2002

Citing 5

Statistical analysis with missing data
Practical Data-Oriented Microaggregation for Statistical Disclosure Control. IEEE Transactions on Knowledge and Data Engineering
Microdata Protection through Noise Addition. Inference Control in Statistical Databases, From Theory to Practice
Sensitive Micro Data Protection Using Latin Hypercube Sampling Technique. Inference Control in Statistical Databases, From Theory to Practice
Disclosure Risk Assessment in Perturbative Microdata Protection. Inference Control in Statistical Databases, From Theory to Practice

Cited 4

Spatial and non-spatial model-based protection procedures for the release of business microdata. Statistics and Computing
Maximum entropy simulation for microdata protection. Statistics and Computing
A theoretical basis for perturbation methods. Statistics and Computing
Information fusion in data privacy: A survey. Information Fusion

Abstract

We argue that any microdata protection strategy is based on a formal reference model. The extent of model specification yields "parametric", "semiparametric", or "nonparametric" strategies. Following this classification, a parametric probability model, such as a normal regression model or a multivariate distribution for simulation, can be specified. Matrix masking (Cox [2]), which covers local suppression, coarsening, microaggregation (Domingo-Ferrer [8]), noise injection, and perturbation (e.g. Kim [15]; Fuller [12]), provides examples of the second and third classes of models. Finally, a nonparametric approach, e.g. the use of bootstrap procedures for generating synthetic microdata (e.g. Dandekar et al. [4]), can be adopted.

In this paper we discuss the application of a regression-based imputation procedure for business microdata to the Italian sample from the Community Innovation Survey. A set of regressions (Franconi and Stander [11]) is used to generate flexible perturbation, in which the degree of protection varies according to the identifiability of the enterprise; a spatial aggregation strategy, based on principal components analysis, is also proposed. The inferential usefulness of the released data and the protection achieved by the strategy are evaluated.
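As a rough illustration of the general idea behind regression-based imputation, the sketch below replaces a sensitive variable with fitted values from an ordinary least squares regression plus normally distributed noise. It is not the procedure of the paper: the function name `regression_mask`, the parameter `noise_scale`, the choice of normal noise, and the toy data are all assumptions made for illustration only.

```python
import numpy as np

def regression_mask(X, y, noise_scale=1.0, seed=None):
    """Hypothetical helper: release OLS fitted values plus scaled normal noise.

    X           : (n, p) array of covariates assumed safe to use in the model
    y           : (n,) sensitive variable to be protected
    noise_scale : assumed tuning knob; 0 releases the fitted values only
    """
    rng = np.random.default_rng(seed)
    X1 = np.column_stack([np.ones(len(y)), X])        # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)     # OLS coefficients
    fitted = X1 @ beta
    # Residual standard deviation with the usual degrees-of-freedom correction.
    sigma = np.sqrt(np.sum((y - fitted) ** 2) / (len(y) - X1.shape[1]))
    released = fitted + noise_scale * rng.normal(0.0, sigma, size=len(y))
    return released, beta

# Toy data (synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 + X @ np.array([1.5, -2.0]) + rng.normal(scale=0.5, size=200)

released, beta = regression_mask(X, y, noise_scale=1.0, seed=1)
```

Because `noise_scale` is multiplied elementwise, it could also be passed as a length-n array so that more identifiable units receive more perturbation; how such unit-level weights would be chosen is left open here and is not taken from the paper.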