Statistical analysis with missing data
Practical Data-Oriented Microaggregation for Statistical Disclosure Control
IEEE Transactions on Knowledge and Data Engineering
Microdata Protection through Noise Addition
Inference Control in Statistical Databases, From Theory to Practice
Sensitive Micro Data Protection Using Latin Hypercube Sampling Technique
Inference Control in Statistical Databases, From Theory to Practice
Disclosure Risk Assessment in Perturbative Microdata Protection
Inference Control in Statistical Databases, From Theory to Practice
Spatial and non-spatial model-based protection procedures for the release of business microdata
Statistics and Computing
Maximum entropy simulation for microdata protection
Statistics and Computing
A theoretical basis for perturbation methods
Statistics and Computing
Information fusion in data privacy: A survey
Information Fusion
We argue that any microdata protection strategy is based on a formal reference model. The extent of model specification yields "parametric", "semiparametric", or "nonparametric" strategies. Following this classification, a parametric probability model, such as a normal regression model or a multivariate distribution for simulation, can be specified. Matrix masking (Cox [2]), which covers local suppression, coarsening, microaggregation (Domingo-Ferrer [8]), noise injection, and perturbation (e.g. Kim [15]; Fuller [12]), provides examples of the second and third classes of models. Finally, a nonparametric approach, e.g. the use of bootstrap procedures for generating synthetic microdata (e.g. Dandekar et al. [4]), can be adopted.

In this paper we discuss the application of a regression-based imputation procedure for business microdata to the Italian sample from the Community Innovation Survey. A set of regressions (Franconi and Stander [11]) is used to generate flexible perturbation, in which the level of protection varies according to the identifiability of the enterprise; a spatial aggregation strategy, based on principal components analysis, is also proposed. The inferential usefulness of the released data and the protection achieved by the strategy are evaluated.
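
As a rough illustration of the idea that regression-based perturbation can vary with how identifiable an enterprise is, the following minimal sketch may help. It is not the procedure of Franconi and Stander [11]. The function name `regression_perturb`, the `risk` score in [0, 1], and the linear blending rule are all assumptions made here for illustration only.

```python
import numpy as np

def regression_perturb(X, y, risk):
    """Blend each observed value with its regression prediction.

    X    : (n, p) array of covariates released unchanged (assumption).
    y    : (n,) array holding the sensitive variable to protect.
    risk : (n,) hypothetical identifiability scores in [0, 1];
           0 releases the record unchanged, 1 replaces it entirely
           by the model prediction.
    """
    X1 = np.column_stack([np.ones(len(X)), X])      # add an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # ordinary least squares fit
    fitted = X1 @ beta                              # model-based imputed values
    return (1.0 - risk) * y + risk * fitted         # more risk, more shrinkage

# Usage on simulated data (all values are synthetic).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=50)
risk = rng.uniform(size=50)                         # hypothetical risk scores
released = regression_perturb(X, y, risk)
```

In this sketch, highly identifiable records are pulled towards the regression surface, while low-risk records stay close to their observed values. A real application would need a justified risk measure and an assessment of both disclosure risk and inferential usefulness, as the abstract indicates.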