Constrained anonymization of production data: a constraint satisfaction problem approach

  • Authors:
  • Ran Yahalom;Erez Shmueli;Tomer Zrihen

  • Affiliations:
  • Deutsche Telekom Laboratories;Deutsche Telekom Laboratories and Department of Information Systems Engineering, Ben-Gurion University, Beer Sheva, Israel;Deutsche Telekom Laboratories and Department of Information Systems Engineering, Ben-Gurion University, Beer Sheva, Israel

  • Venue:
  • SDM'10 Proceedings of the 7th VLDB conference on Secure data management
  • Year:
  • 2010

Abstract

Using production data that contains sensitive information for application testing requires that the data be anonymized first. Anonymization is difficult because production data is usually subject to constraints that must also hold in the anonymized data. We propose a novel approach to anonymizing constrained production data based on the concept of constraint satisfaction problems (CSPs). Due to the generality of the constraint satisfaction framework, our approach can support a wide variety of mandatory integrity constraints as well as constraints that ensure the similarity of the anonymized data to the production data. Our approach decomposes the constrained anonymization problem into independent sub-problems that can be represented and solved as CSPs. Since production databases may contain many records that are associated by vertical constraints, the resulting CSPs may become very large. Such CSPs are further decomposed into dependent sub-problems that are solved iteratively by applying local modifications to the production data. Simulations on synthetic production databases demonstrate the feasibility of our method.
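
To make the general idea of the abstract concrete, the following is a minimal sketch of anonymization posed as a constraint satisfaction problem. It is not the authors' algorithm: the column (`ages`), the function names `solve` and `anonymize_ages`, the `max_shift` tolerance, and the specific constraints are all assumptions chosen for illustration. Each anonymized value is a variable, its domain is restricted to values close to (but different from) the original to stand in for a similarity constraint, and a single cross-record constraint that preserves the relative order of records stands in for a vertical constraint. The solver is plain backtracking and does not attempt the decomposition into sub-problems that the abstract describes.

```python
from typing import Callable, Dict, List, Optional

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def solve(variables: List[str],
          domains: Dict[str, List[int]],
          constraints: List[Constraint],
          assignment: Optional[Assignment] = None) -> Optional[Assignment]:
    """Plain backtracking search; returns the first consistent full assignment."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        return dict(assignment)
    # Pick the next unassigned variable in declaration order.
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        # Constraints are expected to accept partial assignments
        # they cannot yet judge.
        if all(check(assignment) for check in constraints):
            result = solve(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]
    return None


def anonymize_ages(ages: List[int], max_shift: int = 3) -> List[int]:
    """Hypothetical example: anonymize one numeric column under two constraints."""
    variables = [f"age_{i}" for i in range(len(ages))]

    # Assumed similarity constraint, folded into the domains: each new value
    # lies within max_shift of the original but is never equal to it.
    domains = {
        var: [age + d for d in range(-max_shift, max_shift + 1) if d != 0]
        for var, age in zip(variables, ages)
    }

    def order_preserved(asg: Assignment) -> bool:
        # Assumed cross-record constraint: if record i was younger than
        # record j in production, it stays younger after anonymization.
        for i, vi in enumerate(variables):
            for j, vj in enumerate(variables):
                if vi in asg and vj in asg and ages[i] < ages[j]:
                    if not asg[vi] < asg[vj]:
                        return False
        return True

    solution = solve(variables, domains, [order_preserved])
    if solution is None:
        raise ValueError("no anonymization satisfies the constraints")
    return [solution[var] for var in variables]


if __name__ == "__main__":
    print(anonymize_ages([30, 31, 45]))
```

Folding the similarity requirement into the domains keeps the search space small, while the order constraint is checked on every partial assignment so that inconsistent branches are abandoned early. A constraint spanning many records, as in this sketch, is what makes a single CSP grow with the size of the table.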