Statistical Disclosure Control (SDC) protection methods perturb the non-confidential attributes of an original dataset and publish the perturbed results together with the values of the confidential attributes. Traditionally, such a method is considered to achieve a good privacy level if attackers who try to link an original record with its perturbed counterpart have a low probability of success. A different view has recently gained acceptance: protection methods should resist not only record re-identification attacks, but also attacks that try to guess the true value of some confidential attribute of some original record(s); this threat is known as attribute disclosure risk. In this paper we propose a simple strategy for estimating the attribute disclosure risk incurred by a protection method: using a classifier, built from the protected (public) dataset, to predict the confidential attribute values of original records. After defining this approach in detail, we describe experiments that show both the power of the approach and the danger it reveals: very popular protection methods suffer very high attribute disclosure risk.
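The attack described above can be sketched in a few lines. The toy data, attribute names, and the choice of a 1-nearest-neighbour classifier are illustrative assumptions for this sketch, not the paper's actual experimental setup:

```python
# Sketch of the attribute-disclosure attack: an attacker trains a
# classifier on the published (perturbed) dataset and uses it to guess
# the confidential attribute of an original record. All data below is
# invented for illustration; the paper's experiments use real SDC
# methods and datasets.

def nn_predict(train_X, train_y, x):
    """1-nearest-neighbour prediction by squared Euclidean distance."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

# Published dataset: perturbed non-confidential attributes, released
# together with the true confidential attribute values.
protected_X = [[25.3, 50.1], [31.8, 60.4], [44.9, 80.2], [52.1, 90.7]]
confidential_y = ["low", "low", "high", "high"]

# The attacker knows the (unperturbed) non-confidential attributes of
# an original record and queries the classifier to guess its
# confidential value -- no record re-identification is needed.
original_record = [45.0, 79.5]
guess = nn_predict(protected_X, confidential_y, original_record)
print(guess)  # -> "high" for this toy data
```

Note that the attack succeeds whenever the perturbation preserves enough of the statistical link between the non-confidential and confidential attributes for the classifier to exploit, even if linking the original record to its exact perturbed counterpart is hard.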