Identity disclosure protection: A data reconstruction approach for privacy-preserving data mining

  • Authors:
  • Dan Zhu; Xiao-Bai Li; Shuning Wu

  • Affiliations:
  • Department of Logistics, Operations and MIS, Iowa State University, Ames, IA 50011, USA; College of Management, University of Massachusetts Lowell, Lowell, MA 01854, USA; ISO Innovative Analytics, 388 Market Street #750, San Francisco, CA 94111, USA

  • Venue:
  • Decision Support Systems
  • Year:
  • 2009

Abstract

Identity disclosure is one of the most serious privacy concerns in today's information age. A well-known method for protecting against identity disclosure is k-anonymity. A dataset provides k-anonymity protection if the information for each individual in the dataset cannot be distinguished from that of at least k-1 other individuals whose information also appears in the dataset. However, k-anonymity has a flaw that can still allow an intruder to discern the confidential information of individuals in the anonymized data. To overcome this problem, we propose a data reconstruction approach to achieve k-anonymity protection in predictive data mining. In this approach, the potentially identifying attributes are first masked using aggregation (for numeric data) and swapping (for nominal data). A genetic algorithm technique is then applied to the masked data to find a good subset of it. This subset is then replicated to form the released dataset that satisfies the k-anonymity constraint.
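
The k-anonymity condition and the numeric aggregation step mentioned in the abstract can be illustrated with a minimal Python sketch. The function names, the fixed-size grouping rule, and the sample ages are assumptions made for illustration only; the abstract does not specify the paper's actual aggregation procedure, and the swapping and genetic algorithm steps are not shown here.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Check that every combination of quasi-identifier values occurs at least k times."""
    counts = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(c >= k for c in counts.values())


def aggregate_numeric(values, k):
    """Mask numeric values by replacing each one with the mean of its group.

    Illustrative rule (an assumption, not the paper's method): sort the values,
    cut them into consecutive groups of size k, and merge a short tail into
    the last group so that every group has at least k members.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    masked = [0.0] * len(values)
    n = len(order)
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # fewer than k values would remain: absorb them
            end = n
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            masked[i] = mean
        start = end
    return masked


# Hypothetical sample data: ages used as the only potentially identifying attribute.
ages = [23, 25, 31, 35, 52]
masked_ages = aggregate_numeric(ages, k=2)

original = [{"age": a} for a in ages]
released = [{"age": a} for a in masked_ages]

print(is_k_anonymous(original, ["age"], k=2))
print(is_k_anonymous(released, ["age"], k=2))
```

In this sketch each unique age is replaced by a group mean shared with at least one other record, so the masked attribute alone no longer singles out any individual.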