Protecting Privacy Against Record Linkage Disclosure: A Bounded Swapping Approach for Numeric Data

  • Authors: Xiao-Bai Li; Sumit Sarkar

  • Affiliations: Department of Operations and Information Systems, University of Massachusetts Lowell, Lowell, Massachusetts 01854; School of Management, University of Texas at Dallas, Richardson, Texas 75080

  • Venue: Information Systems Research

  • Year: 2011

Abstract

Record linkage techniques have been widely used in areas such as antiterrorism, crime analysis, epidemiologic research, and database marketing. On the other hand, such techniques are also being increasingly used for identity matching that leads to the disclosure of private information. These techniques can be used to effectively reidentify records even in deidentified data. Consequently, the use of such techniques can lead to individual privacy being severely eroded. Our study addresses this important issue and provides a solution to resolve the conflict between privacy protection and data utility. We propose a data-masking method for protecting private information against record linkage disclosure that preserves the statistical properties of the data for legitimate analysis. Our method recursively partitions a data set into smaller subsets such that data records within each subset are more homogeneous after each partition. The partition is made orthogonal to the maximum variance dimension represented by the first principal component in each partitioned set. The attribute values of a record in a subset are then masked using a double-bounded swapping method. The proposed method, which we call multivariate swapping trees, is nonparametric in nature and does not require any assumptions about statistical distributions of the original data. Experiments conducted on real-world data sets demonstrate that the proposed approach significantly outperforms existing methods in terms of both preventing identity disclosure and preserving data quality.
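The partition-then-swap idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and several details are assumptions made only for illustration: the split point (the median of the projections onto the first principal component), the stopping rule (a hypothetical `min_leaf` parameter), and the leaf-level masking step, which here is a plain random permutation of each attribute within a leaf rather than the double-bounded swapping method the paper proposes. The function names are likewise hypothetical.

```python
import numpy as np


def first_principal_component(X):
    """Return the unit vector along the direction of maximum variance."""
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]


def partition(X, idx, min_leaf, leaves):
    """Recursively split the records in idx along their first principal component."""
    # Assumed stopping rule: stop when a split would create a subset
    # smaller than min_leaf.
    if len(idx) < 2 * min_leaf:
        leaves.append(idx)
        return
    sub = X[idx]
    direction = first_principal_component(sub)
    # Project each record onto the first principal component.
    scores = (sub - sub.mean(axis=0)) @ direction
    order = np.argsort(scores, kind="stable")
    # Assumed split point: the median projection, so the cut is a
    # hyperplane orthogonal to the maximum-variance direction.
    half = len(idx) // 2
    partition(X, idx[order[:half]], min_leaf, leaves)
    partition(X, idx[order[half:]], min_leaf, leaves)


def mask(X, min_leaf=5, seed=0):
    """Partition the data, then swap attribute values within each leaf."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    leaves = []
    partition(X, np.arange(len(X)), min_leaf, leaves)
    masked = X.copy()
    for leaf in leaves:
        for j in range(X.shape[1]):
            # Stand-in for the paper's double-bounded swapping: each
            # attribute is permuted independently among the leaf's records.
            masked[leaf, j] = X[rng.permutation(leaf), j]
    return masked, leaves
```

Because values are only exchanged among records in the same leaf, every attribute keeps exactly the same set of values overall, so its marginal distribution (mean, variance, quantiles) is unchanged in this sketch. Cross-attribute relationships are disturbed only within each leaf, where the records are comparatively similar to one another.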