Using classification methods to evaluate attribute disclosure risk

  • Authors:
  • Jordi Nin; Javier Herranz; Vicenç Torra

  • Affiliations:
  • CNRS, LAAS, Toulouse Cedex 4, France; Dept. Matemàtica Aplicada IV, Universitat Politècnica de Catalunya, Barcelona, Spain; Artificial Intelligence Research Institute, Spanish National Research Council, Catalonia, Spain

  • Venue:
  • MDAI'10 Proceedings of the 7th international conference on Modeling decisions for artificial intelligence
  • Year:
  • 2010


Abstract

Statistical Disclosure Control protection methods perturb the non-confidential attributes of an original dataset and publish the perturbed results along with the values of the confidential attributes. Traditionally, such a method is considered to achieve a good privacy level if attackers who try to link an original record with its perturbed counterpart have a low success probability. Another view has lately been gaining popularity: protection methods should resist not only record re-identification attacks, but also attacks that try to guess the true value of some confidential attribute of some original record(s). This is known as attribute disclosure risk. In this paper we propose a simple strategy to estimate the attribute disclosure risk incurred by a protection method: using a classifier, constructed from the protected (public) dataset, to predict the confidential attribute values of some original record. After defining this approach in detail, we describe experiments that show both the power and the danger of the approach: very popular protection methods suffer very high attribute disclosure risk.
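The attack strategy described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's actual method: the data is made up, and a plain 1-nearest-neighbour rule stands in for whatever classifier an attacker might train on the protected dataset.

```python
import math

# Hypothetical toy data (not from the paper): each protected record pairs
# perturbed non-confidential attributes with its published confidential value.
protected = [
    ((1.1, 2.0), "low"),
    ((1.9, 3.1), "low"),
    ((8.2, 7.9), "high"),
    ((9.0, 8.5), "high"),
]

def predict_confidential(original_record, protected_data):
    """1-nearest-neighbour classifier built from the protected (public)
    dataset: return the confidential value attached to the closest
    protected record."""
    _, label = min(
        protected_data,
        key=lambda rec: math.dist(rec[0], original_record),
    )
    return label

# The attacker knows the true non-confidential values of an original record
# and uses the classifier to guess its confidential attribute.
guess = predict_confidential((8.8, 8.0), protected)
print(guess)  # -> high
```

The attribute disclosure risk of a protection method can then be estimated as the fraction of original records for which such a classifier guesses the confidential value correctly.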