Robust Statistics Meets SDC: New Disclosure Risk Measures for Continuous Microdata Masking

  • Authors:
  • Matthias Templ; Bernhard Meindl

  • Affiliations:
  • Department of Methodology, Statistics Austria, Vienna, Austria 1110 and Department of Statistics and Probability Theory, Vienna University of Technology, Vienna, Austria 1040; Department of Methodology, Statistics Austria, Vienna, Austria 1110

  • Venue:
  • PSD '08: Proceedings of the UNESCO Chair in Data Privacy International Conference on Privacy in Statistical Databases
  • Year:
  • 2008

Abstract

The aim of this study is to evaluate the risk of re-identification as captured by distance-based disclosure risk measures for numerical variables. First, we review several previously proposed disclosure risk measures. Unfortunately, none of these measures accounts for outliers. We assume that outliers must be protected more than observations near the center of the data cloud. Therefore, we propose a weighting scheme for each observation based on the concept of robust Mahalanobis distances. We also consider the peculiarities of different protection methods and adapt our measures so that they give realistic results for each method. To test the proposed distance-based disclosure risk measures, we run a simulation study with different amounts of data contamination. The results of the simulation study show the usefulness of the proposed measures and give deeper insight into how the risk of quantitative data can be measured successfully. All the methods proposed, and all the protection methods and measures used in this paper, are implemented in the R package sdcMicro, which is freely available on the Comprehensive R Archive Network (http://cran.r-project.org).
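
The idea of weighting observations by a robust distance can be illustrated with a minimal sketch. It is not the measure defined in the paper and does not use sdcMicro. As a simplifying assumption, the coordinate-wise median and the median absolute deviation (MAD) stand in for a full robust location and covariance estimate, so correlations between variables are ignored. The function names, the sample records, and the normalisation of the weights are all hypothetical.

```python
import math
import statistics


def robust_distances(rows):
    """Distance of each row from the coordinate-wise median, scaled per column.

    Assumption: one MAD per column replaces a full robust covariance estimate,
    so this is only a diagonal approximation of a robust Mahalanobis distance.
    """
    cols = list(zip(*rows))
    centers = [statistics.median(col) for col in cols]
    scales = []
    for col, center in zip(cols, centers):
        mad = statistics.median([abs(v - center) for v in col])
        if mad == 0:
            raise ValueError("MAD is zero; this sketch needs spread in every column")
        # 1.4826 makes the MAD comparable to a standard deviation for normal data.
        scales.append(1.4826 * mad)
    return [
        math.sqrt(sum(((v - c) / s) ** 2 for v, c, s in zip(row, centers, scales)))
        for row in rows
    ]


def outlier_weights(distances):
    """Hypothetical weighting: each distance as a share of the total distance."""
    total = sum(distances)
    return [d / total for d in distances]


# Made-up records with two numerical variables; the last one is an obvious outlier.
records = [
    (1.0, 10.0),
    (2.0, 11.0),
    (3.0, 12.0),
    (4.0, 13.0),
    (5.0, 14.0),
    (50.0, 90.0),
]

distances = robust_distances(records)
weights = outlier_weights(distances)

for record, weight in zip(records, weights):
    print(record, round(weight, 3))
```

Because the median and MAD are barely affected by the extreme record, that record receives by far the largest weight, which matches the assumption that outliers need more protection than observations near the center of the data cloud.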